Conserved domains on  [gi|149056473|gb|EDM07904|]

suppressor of Ty 5 homolog (S. cerevisiae), isoform CRA_b [Rattus norvegicus]

List of domain hits

Name Accession Description Interval E-value
NGN_Euk cd09888
Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW ...
37-124 3.66e-44

Eukaryotic N-Utilization Substance G (NusG) N-terminal (NGN) domain, including plant KTF1 (KOW domain-containing Transcription Factor 1); The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides, Ouzounis and Woese (KOW) repeats at its C-terminus. Spt5 forms an Spt4-Spt5 complex that is an essential RNA polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination-enhancing activity in Escherichia coli and has a variety of functions, including roles in RNA polymerase elongation and Rho-dependent termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements differ. Spt5-like is homologous to the Spt5 proteins present in all eukaryotes but is unique in carrying an additional long carboxy-terminal extension that contains WG/GW motifs. Spt5-like, or KTF1 (KOW domain-containing Transcription Factor 1), is an RNA-directed DNA methylation (RdDM) pathway effector in plants.



Pssm-ID: 193577 [Multi-domain]  Cd Length: 86  Bit Score: 154.23  E-value: 3.66e-44
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  37 NLWTVKCKIGEERATAISLMRKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLgyWNQQMVPI 116
Cdd:cd09888    1 KLWAVKCKPGKEREIVISLMRKFLDLQRTGNPLGIKSVFARDGLKGYIYIEARKEAHVKDAIEGLRGVYL--NTIKLVPI 78

                 ....*...
gi 149056473 117 KEMTDVLK 124
Cdd:cd09888   79 KEMPDVLS 86
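
Each alignment in this listing is preceded by a statistics line with the same layout (Pssm-ID, optional [Multi-domain] flag, Cd Length, Bit Score, E-value). When the listing is processed as plain text, that line can be turned into structured fields. The following is a minimal Python sketch under that assumption; the function name `parse_stats` and the dictionary keys are hypothetical and are not part of any NCBI tool.

```python
import re

# Pattern for the per-hit statistics line as it appears in this listing.
# The "[Multi-domain]" flag is optional; some hits omit it.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line):
    """Return the fields of one statistics line as a dict, or None if absent."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("flag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


if __name__ == "__main__":
    example = ("Pssm-ID: 193577 [Multi-domain]  Cd Length: 86  "
               "Bit Score: 154.23  E-value: 3.66e-44")
    print(parse_stats(example))
```

Returning `None` for lines that do not match lets a caller run the function over every line of the listing and keep only the hits.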
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
627-744 1.49e-28

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.



Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 111.07  E-value: 1.49e-28
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473   627 GSQTPMYG-SGSRTPMYGSQTP----LQDGSRTPHYGSQTPLHDG--SRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEP 699
Cdd:smart01104   1 GGRTPAWGaSGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAWGGAG-PTGSRTPAWGGASAWGNKSSEG 79
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 149056473   700 TPSPQA--YGGTPNPQTPGYpdpssPQVNPQYNPQTPGTPAMYNTDQ 744
Cdd:smart01104  80 SASSWAagPGGAYGAPTPGY-----GGTPSAYGPATPGGGAMAGSAS 121
KOW_Spt5_3 cd06083
KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
330-380 7.55e-28

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.



Pssm-ID: 240507  Cd Length: 51  Bit Score: 106.46  E-value: 7.55e-28
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|.
gi 149056473 330 YFKMGDHVKVIAGRFEGDTGLIVRVEENFVILFSDLTMHELKVLPRDLQLC 380
Cdd:cd06083    1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
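
The two rows of an alignment block (query on the `gi` row, domain model on the `Cdd:` row) can be compared column by column to get a rough percent identity. The sketch below is a minimal Python illustration, not the scoring used by CD-Search: the function name `percent_identity` is hypothetical, columns with a `-` gap in either row are skipped, and case is ignored, which is a simplifying assumption about the lowercase residues seen in some blocks.

```python
def percent_identity(query, subject):
    """Percent of non-gap columns in which the two aligned rows agree."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have the same length")
    aligned = 0
    identical = 0
    for q, s in zip(query.upper(), subject.upper()):
        if q == "-" or s == "-":
            continue  # skip gap columns
        aligned += 1
        if q == s:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


if __name__ == "__main__":
    # Rows copied from the KOW_Spt5_3 (cd06083) block above.
    query = "YFKMGDHVKVIAGRFEGDTGLIVRVEENFVILFSDLTMHELKVLPRDLQLC"
    model = "HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS"
    print(f"{percent_identity(query, model):.1f}% identity")
```

Identity computed this way is only a quick summary; the bit score and E-value reported in the listing remain the authoritative measures of the hit.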
KOW_Spt5_2 cd06082
KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
279-329 5.38e-26

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.



Pssm-ID: 240506  Cd Length: 51  Bit Score: 101.04  E-value: 5.38e-26
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|.
gi 149056473 279 FQPGDNVEVCEGELINLQGKVLSVDGNKITIMPKHEDLKDMLEFPAQELRK 329
Cdd:cd06082    1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_5 cd06085
KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
557-606 5.03e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.



Pssm-ID: 240509  Cd Length: 52  Bit Score: 98.33  E-value: 5.03e-25
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|
gi 149056473 557 DNELIGQTVRISQGPYKGYIGVVKDATESTARVELHSTCQTISVDRQRLT 606
Cdd:cd06085    2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLA 51
KOW_Spt5_6 cd06086
KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
883-940 1.27e-24

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.



Pssm-ID: 240510  Cd Length: 58  Bit Score: 97.20  E-value: 1.27e-24
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|....*...
gi 149056473 883 EHLEPITPTKNNKVKVILGEDREATGVLLSIDGEDGIIRMDlEDQQIKILNLRFLGKL 940
Cdd:cd06086    1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMD-SDGDIKILPMNFLAKL 57
KOW_Spt5_4 cd06084
KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
456-498 7.22e-20

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.



Pssm-ID: 240508  Cd Length: 43  Bit Score: 83.34  E-value: 7.22e-20
                         10        20        30        40
                 ....*....|....*....|....*....|....*....|...
gi 149056473 456 KDIVKVIDGPHSGREGEIRHLYRSFAFLHCKKLVENGGMFVCK 498
Cdd:cd06084    1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
KOW_Spt5_1 cd06081
KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
135-172 1.19e-17

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.



Pssm-ID: 240505  Cd Length: 38  Bit Score: 77.12  E-value: 1.19e-17
                         10        20        30
                 ....*....|....*....|....*....|....*...
gi 149056473 135 KSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMIPRIDY 172
Cdd:cd06081    1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
PHA03269 super family cl29788
envelope glycoprotein C; Provisional
680-825 5.93e-09

envelope glycoprotein C; Provisional


The actual alignment was detected with superfamily member PHA03269:

Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 59.74  E-value: 5.93e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 680 NPNTPSRAEEEYEYAFDDEPTPSPQayggtPNPQTPGYPDPS-SPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQ 758
Cdd:PHA03269  21 NLNTNIPIPELHTSAATQKPDPAPA-----PHQAASRAPDPAvAPTSAASRKPD--------------LAQAPTPAASEK 81
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 149056473 759 PSPSPQSYHQV--APSPAGYQNTHSPasyhPTPSPM-----AYQASPSPSPVGYSPMTPgAPSPGGYNPHTPGS 825
Cdd:PHA03269  82 FDPAPAPHQAAsrAPDPAVAPQLAAA----PKPDAAeaftsAAQAHEAPADAGTSAASK-KPDPAAHTQHSPPP 150
 
Name Accession Description Interval E-value
Spt5-NGN pfam03439
Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG ...
37-123 5.75e-31

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer, and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 116.53  E-value: 5.75e-31
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473   37 NLWTVKCKIGEERATAISLMRKFIAYQfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGywNQQMVPI 116
Cdd:pfam03439   1 KIWAVKCTPGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPI 77

                  ....*..
gi 149056473  117 KEMTDVL 123
Cdd:pfam03439  78 KEMEHLL 84
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
627-685 9.06e-15

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 69.78  E-value: 9.06e-15
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  627 GSQTPMYGS--GSRTPMY---GSQTPL--QDGSRTPHY--GSQTPLHD--GSRTPAQSGAWDPnnPNTPS 685
Cdd:pfam12815   1 GSRTPAYNSagGSRTPAWgadGSRTPAygGAGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGG--SRTPA 68
nusG PRK08559
transcription antitermination protein NusG; Validated
34-175 3.84e-12

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 64.89  E-value: 3.84e-12
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  34 KDPNLWTVKCKIGEERATAISLMRKFIAYQftdtpLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlGYwNQQM 113
Cdd:PRK08559   4 EMSMIFAVKTTAGQERNVALMLAMRAKKEN-----LPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGE 76
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 149056473 114 VPIKEMTDVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKM------IP---RIDYDRI 175
Cdd:PRK08559  77 ISFEEVEHFLKPKPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaavpIPvtvRGDQVRV 147
NGN smart00738
In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, ...
37-125 1.10e-10

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, this domain may confer affinity for Spt4p.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 59.31  E-value: 1.10e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473    37 NLWTVKCKIGEERATAISLMRKFIAYQFTDtplQIKSVVAP-EHVK----------------GYIYVEAYKQTHVKQAIE 99
Cdd:smart00738   1 NWYAVRTTSGQEKRVAENLERKAEALGLED---KIVSILVPtEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIR 77
                           90       100       110
                   ....*....|....*....|....*....|
gi 149056473   100 GV----GNLRLGYWnQQMVPIKEMTDVLKV 125
Cdd:smart00738  78 GTpgvrGFVGGGGK-PTPVPDDEIEKILKP 106
KOW_elon_Spt5 TIGR00405
transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial ...
39-167 2.42e-08

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 53.74  E-value: 2.42e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473   39 WTVKCKIGEERATAislmrKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlgywnqQMVP--- 115
Cdd:TIGR00405   1 FAVKTSVGQEKNVA-----RLMARKARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR------GVVEgei 69
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|...
gi 149056473  116 -IKEMTDVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMI 167
Cdd:TIGR00405  70 dFEEIERFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
PHA03247 PHA03247
large tegument protein UL36; Provisional
652-817 2.64e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.41  E-value: 2.64e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  652 SRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPsraeeeyeyafddePTPSPQAYGGTPN----PQTPGYPDPSSPQVNP 727
Cdd:PHA03247 2710 PAPHALVSATPLPPGPAAARQASPALPAAPAPP--------------AVPAGPATPGGPArparPPTTAGPPAPAPPAAP 2775
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  728 QYNPQTPGTPAMYNTDQFSPYAAPSPqgsyqPSPSPQSYHQVAPSPAgYQNTHSPASYHPTPsPMAYQASPSPSPVGYSP 807
Cdd:PHA03247 2776 AAGPPRRLTRPAVASLSESRESLPSP-----WDPADPPAAVLAPAAA-LPPAASPAGPLPPP-TSAQPTAPPPPPGPPPP 2848
                         170
                  ....*....|..
gi 149056473  808 MTP--GAPSPGG 817
Cdd:PHA03247 2849 SLPlgGSVAPGG 2860
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
699-825 7.34e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 56.70  E-value: 7.34e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  699 PTPSPQAYGGTPNPQTPGYPDPSSPQVN-PQYNPQTPGTP--AMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 775
Cdd:pfam03154 188 PPGTTQAATAGPTPSAPSVPPQGSPATSqPPNQTQSTAAPhtLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLP 267
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 149056473  776 YQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPG----APSPGGYNPHTPGS 825
Cdd:pfam03154 268 QPSLHGQMPPMPhslqtgpshMQHPVPPQPFPLTPQSSQSQVPPGpspaAPGQSQQRIHTPPS 330
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
601-832 8.94e-07

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 52.76  E-value: 8.94e-07
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 601 DRQRLTTVDSQRPGGMtSTYGRTPMYGSQTPMYgsgsrTPMYGSQTPLQDGSRTPHYGSQTPLH-------DGSRTPaQS 673
Cdd:COG5180  154 LLQRSDPILAKDPDGD-SASTLPPPAEKLDKVL-----TEPRDALKDSPEKLDRPKVEVKDEAQeeppdltGGADHP-RP 226
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 674 GAWDPNNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPNP----QTPGYPDPSSPQVNPQYNPQT--PGTPAMYNTDQFS 746
Cdd:COG5180  227 EAASSPKVDPPSTSEARSRPATVDaQPEMRPPADAKERRRaaigDTPAAEPPGLPVLEAGSEPQSdaPEAETARPIDVKG 306
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 747 PYAAPSPQGSYQPSPS-----PQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSpspvgySPMTPGAPSPG--GYN 819
Cdd:COG5180  307 VASAPPATRPVRPPGGardpgTPRPGQPTERPAGVPEAASDAGQPPSAYPPAEEAVPG------KPLEQGAPRPGssGGD 380
                        250
                 ....*....|....*
gi 149056473 820 --PHTPGSGIEQNSS 832
Cdd:COG5180  381 gaPFQPPNGAPQPGL 395
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
657-825 3.74e-05

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as Osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus.
SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.
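
The consensus NNRCRCCYY quoted above uses degenerate nucleotide codes (N = any base, R = A/G, Y = C/T). As an illustration only, the sketch below expands such a consensus into a regular expression and scans a DNA string with it. The function names and the test sequence are hypothetical, and only the strand as written is scanned (no reverse complement).

```python
import re

# Only the codes used in the consensus quoted above, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",     # purine
    "Y": "[CT]",     # pyrimidine
    "N": "[ACGT]",   # any nucleotide
}


def consensus_to_regex(consensus):
    """Expand a degenerate consensus such as NNRCRCCYY into a regex string."""
    return "".join(IUPAC[symbol] for symbol in consensus.upper())


def find_sites(consensus, dna):
    """Return (start, matched_text) for every site, including overlapping ones."""
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    # Made-up sequence for illustration; not taken from any real promoter.
    print(find_sites("NNRCRCCYY", "TTAAGCACCCTGG"))
```

A production scan would also need to check the reverse-complement strand, which this sketch leaves out for brevity.
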


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 46.82  E-value: 3.74e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 657 YGSQTPLHDgSRTPAQSGAWDPNNPNTP--------SRAEE----EYEYAFDD-----EPTPSPQA---YGGTPNPQTPG 716
Cdd:cd22542   26 FGGSSPIRD-SATPGKPGNNPGKKPYSLgsdlssakSRSSElmgdSYTATFSSgnglmSPSGSPQAsttYGNDYNPFSHS 104
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 717 YPDPSSPQ----VNPQYNPQTPGTPAMYNT-DQFSPY-----AAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYH 786
Cdd:cd22542  105 FPTSSGSQdpslLVSKGHPSADCLPSVYTSlDMAHPYgswykTGIHPGISSSSTNATASWWDMHSNTNWLSAQGQPDGLQ 184
                        170       180       190
                 ....*....|....*....|....*....|....*....
gi 149056473 787 PTPSPMAYQASPSPSPVGYSPMTPgaPSPGGYNPHTPGS 825
Cdd:cd22542  185 ASLQPVPAQTPLNPQLPSYTEFTT--LNPAPYPAVGISS 221
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
330-357 4.74e-05

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 41.16  E-value: 4.74e-05
                           10        20
                   ....*....|....*....|....*...
gi 149056473   330 YFKMGDHVKVIAGRFEGDTGLIVRVEEN 357
Cdd:smart00739   1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
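
In the alignment rows on this page, each line gives a label, a start coordinate, the aligned residues (with `-` for gaps), and an end coordinate. A small sketch of checking that the number of residues agrees with the coordinates is shown below. It assumes the whitespace-separated layout seen in these rows, and `parse_alignment_row` is a hypothetical helper name.

```python
def parse_alignment_row(line):
    """Split one alignment row into label, coordinates and residue count.

    Assumes the last three whitespace-separated fields are
    start, aligned sequence, end (as in the rows on this page).
    """
    parts = line.split()
    if len(parts) < 4:
        raise ValueError("not an alignment row: %r" % line)
    start, seq, end = int(parts[-3]), parts[-2], int(parts[-1])
    # Gap characters ('-') do not consume a coordinate; letters of either case do.
    residues = sum(1 for ch in seq if ch.isalpha())
    return {
        "label": " ".join(parts[:-3]),
        "start": start,
        "end": end,
        "residues": residues,
        "consistent": residues == end - start + 1,
    }


if __name__ == "__main__":
    query = "gi 149056473   330 YFKMGDHVKVIAGRFEGDTGLIVRVEEN 357"
    subject = "Cdd:smart00739   1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28"
    for row in (query, subject):
        print(parse_alignment_row(row))
```

A row whose residue count disagrees with its coordinates is a hint that the line was truncated or damaged during extraction.
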
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
561-592 8.85e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 40.45  E-value: 8.85e-05
                          10        20        30
                  ....*....|....*....|....*....|..
gi 149056473  561 IGQTVRISQGPYKGYIGVVKDATESTARVELH 592
Cdd:pfam00467   1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
679-815 2.04e-04

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 45.14  E-value: 2.04e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 679 NNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPN-PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQG- 755
Cdd:NF033839 249 DNVNTKVEIENTVHKIFADmDAVVTKFKKGLTQDtPKEPGNKKPSAPKPGMQPSPQPEKKEVKPEPETPKPEVKPQLEKp 328
                         90       100       110       120       130       140
                 ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 149056473 756 SYQPSPSPQSYH-QVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSpvgySPMTPGAPSP 815
Cdd:NF033839 329 KPEVKPQPEKPKpEVKPQLETPKPEVKPQPEKPKPEVKPQPEKPKPE----VKPQPETPKP 385
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
334-362 2.06e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.29  E-value: 2.06e-04
                          10        20        30
                  ....*....|....*....|....*....|.
gi 149056473  334 GDHVKVIAGRFEGDTGLIVRVEE--NFVILF 362
Cdd:pfam00467   2 GDVVRVIAGPFKGKVGKVVEVDDkkNRVLVE 32
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
684-823 1.10e-03

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 41.12  E-value: 1.10e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 684 PSRAEEEYEYAFDDEPTPSPQAYG----GTPNPQTPGYPDPSSPQVNPQYN-PQTPGTPAMYNTDQFSPYAAPSPQGSYQ 758
Cdd:cd21972   22 LDLEFILSNTVTSDNDNPPPPDPAypppESPESCSTVYDSDGCHPTPNAYCgPNGPGLPGHFLLAGNSPNLGPKIKTENQ 101
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149056473 759 PS-------PSPQSYHQVAPS------PAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPmtPGAPSPGGYNPHTP 823
Cdd:cd21972  102 EQacmpvagYSGHYGPREPQRvppappPPQYAGHFQYHGHFNMFSPPLRANHPGMSTVMLTP--LSTPPLGFLSPEEA 177
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
697-823 3.44e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 41.29  E-value: 3.44e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 697 DEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQynPQTPGTPAMYNTDQFSPYAAPSPQ-GSYQPSPSPQSYH-QVAPSPA 774
Cdd:NF033839 370 EKPKPEVKPQPETPKPEVKPQPEKPKPEVKPQ--PEKPKPEVKPQPEKPKPEVKPQPEkPKPEVKPQPEKPKpEVKPQPE 447
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|...
gi 149056473 775 GYQNTHSPASYHPTPSPMAYQASPSPSpVGYSPMTP----GAPSPGGYNPHTP 823
Cdd:NF033839 448 KPKPEVKPQPETPKPEVKPQPEKPKPE-VKPQPEKPkpdnSKPQADDKKPSTP 499
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
455-486 5.18e-03

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 35.44  E-value: 5.18e-03
                          10        20        30
                  ....*....|....*....|....*....|..
gi 149056473  455 VKDIVKVIDGPHSGREGEIRHLYRSFAFLHCK 486
Cdd:pfam00467   1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
 
Spt5-NGN pfam03439
Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG ...
37-123 5.75e-31

Early transcription elongation factor of RNA pol II, NGN section; Spt5p and prokaryotic NusG are shown to contain a novel 'NGN' domain. The combined NGN and KOW motif regions of Spt5 form the binding domain with Spt4. Spt5 complexes with Spt4 as a 1:1 heterodimer, and this Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The Schizosaccharomyces pombe core Spt5-Spt4 complex is a heterodimer bearing a trypsin-resistant Spt4-binding domain within the Spt5 subunit.


Pssm-ID: 397481  Cd Length: 84  Bit Score: 116.53  E-value: 5.75e-31
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473   37 NLWTVKCKIGEERATAISLMRKFIAYQfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLGywNQQMVPI 116
Cdd:pfam03439   1 KIWAVKCTPGQEREVALSLMRKILALA-KTNNLGIYSVFAPDGLKGYIYVEADRQAAVKRALEGIPNVRGL--VPGLVPI 77

                  ....*..
gi 149056473  117 KEMTDVL 123
Cdd:pfam03439  78 KEMEHLL 84
CTD smart01104
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
627-744 1.49e-28

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 215026 [Multi-domain]  Cd Length: 121  Bit Score: 111.07  E-value: 1.49e-28
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473   627 GSQTPMYG-SGSRTPMYGSQTP----LQDGSRTPHYGSQTPLHDG--SRTPAQSGAWdPNNPNTPSRAEEEYEYAFDDEP 699
Cdd:smart01104   1 GGRTPAWGaSGSKTPAWGSRTPgtaaGGAPTARGGSGSRTPAWGGagSRTPAWGGAG-PTGSRTPAWGGASAWGNKSSEG 79
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 149056473   700 TPSPQA--YGGTPNPQTPGYpdpssPQVNPQYNPQTPGTPAMYNTDQ 744
Cdd:smart01104  80 SASSWAagPGGAYGAPTPGY-----GGTPSAYGPATPGGGAMAGSAS 121
KOW_Spt5_3 cd06083
KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
330-380 7.55e-28

KOW domain of Spt5, repeat 3; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240507  Cd Length: 51  Bit Score: 106.46  E-value: 7.55e-28
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|.
gi 149056473 330 YFKMGDHVKVIAGRFEGDTGLIVRVEENFVILFSDLTMHELKVLPRDLQLC 380
Cdd:cd06083    1 HFKVGDHVKVISGRHEGETGLVVKVEDDVVTVFSDLTMRELKVFPRDLQLS 51
KOW_Spt5_2 cd06082
KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
279-329 5.38e-26

KOW domain of Spt5, repeat 2; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240506  Cd Length: 51  Bit Score: 101.04  E-value: 5.38e-26
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|.
gi 149056473 279 FQPGDNVEVCEGELINLQGKVLSVDGNKITIMPKHEDLKDMLEFPAQELRK 329
Cdd:cd06082    1 FQPGDNVEVIEGELKGLQGKVESVDGDIVTIMPKHEDLKEPLEFPAKELRK 51
KOW_Spt5_5 cd06085
KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
557-606 5.03e-25

KOW domain of Spt5, repeat 5; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240509  Cd Length: 52  Bit Score: 98.33  E-value: 5.03e-25
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|
gi 149056473 557 DNELIGQTVRISQGPYKGYIGVVKDATESTARVELHSTCQTISVDRQRLT 606
Cdd:cd06085    2 RDPLIGKTVRIRKGPYKGYIGIVKDATGTTARVELHSKNKTITVDRSRLA 51
KOW_Spt5_6 cd06086
KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
883-940 1.27e-24

KOW domain of Spt5, repeat 6; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240510  Cd Length: 58  Bit Score: 97.20  E-value: 1.27e-24
                         10        20        30        40        50
                 ....*....|....*....|....*....|....*....|....*....|....*...
gi 149056473 883 EHLEPITPTKNNKVKVILGEDREATGVLLSIDGEDGIIRMDlEDQQIKILNLRFLGKL 940
Cdd:cd06086    1 EHLEPVPPEKGDRVKVIKGEDRGSTGELISIDGADGIVKMD-SDGDIKILPMNFLAKL 57
KOW_Spt5_4 cd06084
KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
456-498 7.22e-20

KOW domain of Spt5, repeat 4; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240508  Cd Length: 43  Bit Score: 83.34  E-value: 7.22e-20
                         10        20        30        40
                 ....*....|....*....|....*....|....*....|...
gi 149056473 456 KDIVKVIDGPHSGREGEIRHLYRSFAFLHCKKLVENGGMFVCK 498
Cdd:cd06084    1 GDTVKVVDGPYKGRQGTVLHIYRGTLFLHSREVTENGGIFVVR 43
KOW_Spt5_1 cd06081
KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW ...
135-172 1.19e-17

KOW domain of Spt5, repeat 1; Spt5, a eukaryotic ortholog of NusG, contains multiple KOW motifs at its C-terminus. Spt5 is involved in transcription elongation and termination. The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. KOW_Spt5 domains play critical roles in recruitment of multiple other eukaryotic transcription elongation and RNA biogenesis factors and additionally are involved in the binding of the eukaryotic Spt5 proteins to RNA polymerases.


Pssm-ID: 240505  Cd Length: 38  Bit Score: 77.12  E-value: 1.19e-17
                         10        20        30
                 ....*....|....*....|....*....|....*...
gi 149056473 135 KSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMIPRIDY 172
Cdd:cd06081    1 GSWVRIKRGIYKGDLAQVDEVDENGNRVVVKLIPRIDY 38
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
627-685 9.06e-15

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 69.78  E-value: 9.06e-15
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  627 GSQTPMYGS--GSRTPMY---GSQTPL--QDGSRTPHY--GSQTPLHD--GSRTPAQSGAWDPnnPNTPS 685
Cdd:pfam12815   1 GSRTPAYNSagGSRTPAWgadGSRTPAygGAGGRTPAYnqGGKTPAWGgaGSRTPAYYGAWGG--SRTPA 68
nusG PRK08559
transcription antitermination protein NusG; Validated
34-175 3.84e-12

transcription antitermination protein NusG; Validated


Pssm-ID: 181467 [Multi-domain]  Cd Length: 153  Bit Score: 64.89  E-value: 3.84e-12
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  34 KDPNLWTVKCKIGEERATAISLMRKFIAYQftdtpLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlGYwNQQM 113
Cdd:PRK08559   4 EMSMIFAVKTTAGQERNVALMLAMRAKKEN-----LPIYAILAPPELKGYVLVEAESKGAVEEAIRGIPHVR-GV-VPGE 76
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 149056473 114 VPIKEMTDVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKM------IP---RIDYDRI 175
Cdd:PRK08559  77 ISFEEVEHFLKPKPIVEGIKEGDIVELIAGPFKGEKARVVRVDESKEEVTVELleaavpIPvtvRGDQVRV 147
NGN cd08000
N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization ...
37-123 7.52e-11

N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization Substance G (NusG) and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms a Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but their functions and requirements are different. The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria.


Pssm-ID: 193574 [Multi-domain]  Cd Length: 99  Bit Score: 59.64  E-value: 7.52e-11
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  37 NLWTVKCKIGEERATAISLMRKFIA---------YQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRLG 107
Cdd:cd08000    1 NWYVLFVKTGREEKVEKLLEKRFEAndieafvpkKEVPERKRGKIEEVIKPLFPGYVFVETDLSPELYELIREVPGVIGI 80
                         90
                 ....*....|....*....
gi 149056473 108 YWN---QQMVPIKEMTDVL 123
Cdd:cd08000   81 LGNgeePSPVSDEEIEMIL 99
NGN smart00738
In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, ...
37-125 1.10e-10

In Spt5p, this domain may confer affinity for Spt4p. It possesses an RNP-like fold; In Spt5p, this domain may confer affinity for Spt4p.


Pssm-ID: 197850 [Multi-domain]  Cd Length: 106  Bit Score: 59.31  E-value: 1.10e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473    37 NLWTVKCKIGEERATAISLMRKFIAYQFTDtplQIKSVVAP-EHVK----------------GYIYVEAYKQTHVKQAIE 99
Cdd:smart00738   1 NWYAVRTTSGQEKRVAENLERKAEALGLED---KIVSILVPtEEVKeirrgkkkvverklfpGYIFVEADLEDEVWTAIR 77
                           90       100       110
                   ....*....|....*....|....*....|
gi 149056473   100 GV----GNLRLGYWnQQMVPIKEMTDVLKV 125
Cdd:smart00738  78 GTpgvrGFVGGGGK-PTPVPDDEIEKILKP 106
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
644-825 2.76e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 61.32  E-value: 2.76e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  644 SQTPLQDGSRTPHYGSQTPLHDgsrtPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPtPSPQAYGGTPNPQTPGYPdPSSP 723
Cdd:pfam03154 259 SQVSPQPLPQPSLHGQMPPMPH----SLQTGPSHMQHPVPPQPFPLTPQSSQSQVP-PGPSPAAPGQSQQRIHTP-PSQS 332
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  724 QVNPQYNP-QTPGTPAMYNTdqfsPYAAPSPQGSYQPSPSPQSY----HQVAPSPagYQ-----------------NTHS 781
Cdd:pfam03154 333 QLQSQQPPrEQPLPPAPLSM----PHIKPPPTTPIPQLPNPQSHkhppHLSGPSP--FQmnsnlppppalkplsslSTHH 406
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*..
gi 149056473  782 PASYHPTPSPMAYQASPSPSPVGYSPM---TPGAPSPGGYNPHTPGS 825
Cdd:pfam03154 407 PPSAHPPPLQLMPQSQQLPPPPAQPPVltqSQSLPPPAASHPPTSGL 453
PHA03269 PHA03269
envelope glycoprotein C; Provisional
680-825 5.93e-09

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 59.74  E-value: 5.93e-09
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 680 NPNTPSRAEEEYEYAFDDEPTPSPQayggtPNPQTPGYPDPS-SPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQ 758
Cdd:PHA03269  21 NLNTNIPIPELHTSAATQKPDPAPA-----PHQAASRAPDPAvAPTSAASRKPD--------------LAQAPTPAASEK 81
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 149056473 759 PSPSPQSYHQV--APSPAGYQNTHSPasyhPTPSPM-----AYQASPSPSPVGYSPMTPgAPSPGGYNPHTPGS 825
Cdd:PHA03269  82 FDPAPAPHQAAsrAPDPAVAPQLAAA----PKPDAAeaftsAAQAHEAPADAGTSAASK-KPDPAAHTQHSPPP 150
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
631-832 6.85e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 59.78  E-value: 6.85e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  631 PMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPsrAEEEYEYAFDDEPTPSPQayggTP 710
Cdd:pfam03154 294 PPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPP--APLSMPHIKPPPTTPIPQ----LP 367
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  711 NPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQgsyqPSP---SPQSyHQVAPSPAG------YQNTHS 781
Cdd:pfam03154 368 NPQSHKHPPHLSGPSPFQMNSNLPPPPALKPLSSLSTHHPPSAH----PPPlqlMPQS-QQLPPPPAQppvltqSQSLPP 442
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*..
gi 149056473  782 PASYHPTPSpmAYQASPSPSPVGYSPMTPGAP----SPGGYNPHTP--GSGIEQNSS 832
Cdd:pfam03154 443 PAASHPPTS--GLHQVPSQSPFPQHPFVPGGPppitPPSGPPTSTSsaMPGIQPPSS 497
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
582-901 8.66e-09

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 59.20  E-value: 8.66e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  582 ATESTARVELHSTCQTISvdrqrlTTVDSQRPGGMTSTYGRT--PMYGSQTPMYGSGsrTPMYGSQTPlQDGSRTPHYGS 659
Cdd:pfam17823 170 AASPAPRTAASSTTAASS------TTAASSAPTTAASSAPATltPARGISTAATATG--HPAAGTALA-AVGNSSPAAGT 240
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  660 QTPLhDGSRTPAQSGawdpnnpnTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGypDPSSPQVNPQYNPQTPGTPAM 739
Cdd:pfam17823 241 VTAA-VGTVTPAALA--------TLAAAAGTVASAAGTINMGDPHARRLSPAKHMPS--DTMARNPAAPMGAQAQGPIIQ 309
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  740 YNTDQfsPYAAPSPqgsyQPSPSPQSYHQVAPSPAGYQNTHSPASyhPTPSPMAYQASPSPSPVGYSPMTPGA------- 812
Cdd:pfam17823 310 VSTDQ--PVHNTAG----EPTPSPSNTTLEPNTPKSVASTNLAVV--TTTKAQAKEPSASPVPVLHTSMIPEVeatsptt 381
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  813 -PSPGGYNPHTPGSGIEQNSSdwvttdiQVKVRDTyLDTQIVGQT----GVIRSVTGGMCSVYLKDSEKVVSIssehlEP 887
Cdd:pfam17823 382 qPSPLLPTQGAAGPGILLAPE-------QVATEAT-AGTASAGPTprssGDPKTLAMASCQLSTQGQYLVVTT-----DP 448
                         330
                  ....*....|....*...
gi 149056473  888 ITPTKNNK----VKVILG 901
Cdd:pfam17823 449 LTPALVDKmfllVVLILG 466
CTD pfam12815
Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription ...
614-674 1.32e-08

Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif.


Pssm-ID: 372327 [Multi-domain]  Cd Length: 71  Bit Score: 52.45  E-value: 1.32e-08
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149056473  614 GGMTSTYG----RTPMYG---SQTPMYGSGSRTPMYGsqtplQDGSRTPHYGSQTplhDGSRTPAQSG 674
Cdd:pfam12815  12 GSRTPAWGadgsRTPAYGgagGRTPAYNQGGKTPAWG-----GAGSRTPAYYGAW---GGSRTPAYGG 71
KOW_elon_Spt5 TIGR00405
transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial ...
39-167 2.42e-08

transcription elongation factor Spt5; This protein contains a KOW domain, shared by bacterial NusG and the uL24 (previously L24p/L26e) family of ribosomal proteins. The most recent papers and crystal structures make this a transcription elongation factor rather than a ribosomal protein.


Pssm-ID: 129499 [Multi-domain]  Cd Length: 145  Bit Score: 53.74  E-value: 2.42e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473   39 WTVKCKIGEERATAislmrKFIAYQFTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLRlgywnqQMVP--- 115
Cdd:TIGR00405   1 FAVKTSVGQEKNVA-----RLMARKARKSGLEVYSILAPESLKGYILVEAETKIDMRNPIIGVPHVR------GVVEgei 69
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|...
gi 149056473  116 -IKEMTDVLKVVKEVANLKPKSWVRLKRGIYKDDIAQVDYVEPSQNTISLKMI 167
Cdd:TIGR00405  70 dFEEIERFLTPKKIIESIKKGDIVEIISGPFKGERAKVIRVDESKEEVTLELI 122
PHA03247 PHA03247
large tegument protein UL36; Provisional
652-817 2.64e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.41  E-value: 2.64e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  652 SRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPsraeeeyeyafddePTPSPQAYGGTPN----PQTPGYPDPSSPQVNP 727
Cdd:PHA03247 2710 PAPHALVSATPLPPGPAAARQASPALPAAPAPP--------------AVPAGPATPGGPArparPPTTAGPPAPAPPAAP 2775
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  728 QYNPQTPGTPAMYNTDQFSPYAAPSPqgsyqPSPSPQSYHQVAPSPAgYQNTHSPASYHPTPsPMAYQASPSPSPVGYSP 807
Cdd:PHA03247 2776 AAGPPRRLTRPAVASLSESRESLPSP-----WDPADPPAAVLAPAAA-LPPAASPAGPLPPP-TSAQPTAPPPPPGPPPP 2848
                         170
                  ....*....|..
gi 149056473  808 MTP--GAPSPGG 817
Cdd:PHA03247 2849 SLPlgGSVAPGG 2860
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
699-825 7.34e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 56.70  E-value: 7.34e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  699 PTPSPQAYGGTPNPQTPGYPDPSSPQVN-PQYNPQTPGTP--AMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 775
Cdd:pfam03154 188 PPGTTQAATAGPTPSAPSVPPQGSPATSqPPNQTQSTAAPhtLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLP 267
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 149056473  776 YQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPG----APSPGGYNPHTPGS 825
Cdd:pfam03154 268 QPSLHGQMPPMPhslqtgpshMQHPVPPQPFPLTPQSSQSQVPPGpspaAPGQSQQRIHTPPS 330
NGN_Arch cd09887
Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance ...
38-105 7.78e-08

Archaeal N-Utilization Substance G (NusG) N-terminal (NGN) domain; The N-Utilization Substance G (NusG) protein and its eukaryotic homolog, Spt5, are involved in transcription elongation and termination. Archaea have a eukaryotic-type transcription apparatus but bacterial-type transcription factors. NusG is one of the few archaeal transcription factors that has orthologs in both bacteria and eukaryotes. Archaeal NusG is similar to bacterial NusG, composed of an NGN domain and a Kyrpides Ouzounis and Woese (KOW) repeat. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Archaeal NusG forms a complex with DNA-directed RNA polymerase subunit E (rpoE) that is similar to the Spt5-Spt4 complex in eukaryotes.


Pssm-ID: 193576  Cd Length: 82  Bit Score: 50.62  E-value: 7.78e-08
                         10        20        30        40        50        60
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149056473  38 LWTVKCKIGEERATAISLMRKFiayqfTDTPLQIKSVVAPEHVKGYIYVEAYKQTHVKQAIEGVGNLR 105
Cdd:cd09887    2 IYAVKTTAGQERNVADLLAMRA-----EKENLDVYSILVPEELKGYVFVEAEDPDRVEELIRGIPHVR 64
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
613-826 8.41e-08

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 56.17  E-value: 8.41e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  613 PGGMTSTYGRTPMYGSQTPMYGSGSRTPMYGSQ-------TPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNP---- 681
Cdd:pfam09606 231 PQQMGGAPNQVAMQQQQPQQQGQQSQLGMGINQmqqmpqgVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYqqqq 310
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  682 --NTPSRAEEEYEYAFDDEPTPS--------PQAYGGTPNPQTPGypdpssPQVNPQYNPQTPGTPAMYNTDQFSPYAAP 751
Cdd:pfam09606 311 trQQQQQQGGNHPAAHQQQMNQSvgqggqvvALGGLNHLETWNPG------NFGGLGANPMQRGQPGMMSSPSPVPGQQV 384
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 149056473  752 SPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASyHPTPSPmAYQASPSPSPVGY--SPMTPGAPSPGGyNPHTPGSG 826
Cdd:pfam09606 385 RQVTPNQFMRQSPQPSVPSPQGPGSQPPQSHPG-GMIPSP-ALIPSPSPQMSQQpaQQRTIGQDSPGG-SLNTPGQS 458
PHA03377 PHA03377
EBNA-3C; Provisional
593-823 1.00e-07

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 56.21  E-value: 1.00e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  593 STCQTISVDRQRLTTVDSQRPGGMTSTygrTPMYGSQTPMYgSGSRTPMYGSQTPLQD---GSRTPHYGSQTPLHDGSRT 669
Cdd:PHA03377  686 SVFVLPSVDAGRAQPSEESHLSSMSPT---QPISHEEQPRY-EDPDDPLDLSLHPDQApppSHQAPYSGHEEPQAQQAPY 761
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  670 PaqsGAWDPNNPNTPSRAEEEyeyafddeptpsPQAYGGTPNpQTPGY--PDPSSPQvNPQY--------------NPQT 733
Cdd:PHA03377  762 P---GYWEPRPPQAPYLGYQE------------PQAQGVQVS-SYPGYagPWGLRAQ-HPRYrhswaywsqypghgHPQG 824
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  734 PGTP-AMYNTDQFSPYAAP-----SPQGSYQPSPSP----QSYHQVAPSPAGYQNTHSPASYHPTPS----PMAYQASPS 799
Cdd:PHA03377  825 PWAPrPPHLPPQWDGSAGHgqdqvSQFPHLQSETGPprlqLSQVPQLPYSQTLVSSSAPSWSSPQPRapirPIPTRFPPP 904
                         250       260
                  ....*....|....*....|....
gi 149056473  800 PSPVGYSpMTPGAPSPGGYNPHTP 823
Cdd:PHA03377  905 PMPLQDS-MAVGCDSSGTACPSMP 927
PHA03247 PHA03247
large tegument protein UL36; Provisional
617-837 1.53e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 55.71  E-value: 1.53e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  617 TSTYGRTPMYGSQTPMygsgSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSR-TPAQSGAWDPNNPNTPsRAEEEYEYAF 695
Cdd:PHA03247 2817 ALPPAASPAGPLPPPT----SAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRrPPSRSPAAKPAAPARP-PVRRLARPAV 2891
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  696 DDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQtpgtpamyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAG 775
Cdd:PHA03247 2892 SRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQ--------------PQPPPPPPPRPQPPLAPTTDPAGAGEPSG 2957
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 149056473  776 YQNTHSPASYHPTPSPMAYQASPSPSPvgySPMTPGAPSPGgyNPHTPGSGIeqnsSDWVTT 837
Cdd:PHA03247 2958 AVPQPWLGALVPGRVAVPRFRVPQPAP---SREAPASSTPP--LTGHSLSRV----SSWASS 3010
PHA03378 PHA03378
EBNA-3B; Provisional
610-817 1.98e-07

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 55.07  E-value: 1.98e-07
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 610 SQRPGGMT--STYGRTPMYGSQTPMYGSGSRTPMYGSQTPLQD-----------GSRTPHYGSQTPLHDGSRTPAQSGAW 676
Cdd:PHA03378 593 AQTPWPVPhpSQTPEPPTTQSHIPETSAPRQWPMPLRPIPMRPlrmqpitfnvlVFPTPHQPPQVEITPYKPTWTQIGHI 672
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 677 dPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPgyPDPSS-PQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQG 755
Cdd:PHA03378 673 -PYQPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAP--PGRAQrPAAATGRARPPAAAPGRARPPAAAPGRARPPAA 749
                        170       180       190       200       210       220       230
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 149056473 756 SYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMA-------YQASPSPSP---VGYSPMTPGAPSPGG 817
Cdd:PHA03378 750 APGRARPPAAAPGRARPPAA-----APGAPTPQPPPQAppapqqrPRGAPTPQPppqAGPTSMQLMPRAAPG 816
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
282-328 2.52e-07

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 47.98  E-value: 2.52e-07
                         10        20        30        40
                 ....*....|....*....|....*....|....*....|....*....
gi 149056473 282 GDNVEVCEGELINLQGKVLSVDG--NKITIMPKHEDLKDMLEFPAQELR 328
Cdd:cd00380    1 GDVVRVLRGPYKGREGVVVDIDPrfGIVTVKGATGSKGAELKVRFDDVD 49
PHA03247 PHA03247
large tegument protein UL36; Provisional
644-826 7.37e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.40  E-value: 7.37e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  644 SQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYP----- 718
Cdd:PHA03247 2567 SVPPPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHppptv 2646
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  719 --------DPSSPQVNPQYNPQTPGTPAMYN--TDQFSPYAAPSPQGSY-----------QPSPSPQSYHQVAPSPAGYQ 777
Cdd:PHA03247 2647 ppperprdDPAPGRVSRPRRARRLGRAAQASspPQRPRRRAARPTVGSLtsladppppppTPEPAPHALVSATPLPPGPA 2726
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....
gi 149056473  778 NTHS-----PASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGynPHTPGSG 826
Cdd:PHA03247 2727 AARQaspalPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAP--PAAPAAG 2778
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
601-832 8.94e-07

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 52.76  E-value: 8.94e-07
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 601 DRQRLTTVDSQRPGGMtSTYGRTPMYGSQTPMYgsgsrTPMYGSQTPLQDGSRTPHYGSQTPLH-------DGSRTPaQS 673
Cdd:COG5180  154 LLQRSDPILAKDPDGD-SASTLPPPAEKLDKVL-----TEPRDALKDSPEKLDRPKVEVKDEAQeeppdltGGADHP-RP 226
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 674 GAWDPNNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPNP----QTPGYPDPSSPQVNPQYNPQT--PGTPAMYNTDQFS 746
Cdd:COG5180  227 EAASSPKVDPPSTSEARSRPATVDaQPEMRPPADAKERRRaaigDTPAAEPPGLPVLEAGSEPQSdaPEAETARPIDVKG 306
                        170       180       190       200       210       220       230       240
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 747 PYAAPSPQGSYQPSPS-----PQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSpspvgySPMTPGAPSPG--GYN 819
Cdd:COG5180  307 VASAPPATRPVRPPGGardpgTPRPGQPTERPAGVPEAASDAGQPPSAYPPAEEAVPG------KPLEQGAPRPGssGGD 380
                        250
                 ....*....|....*
gi 149056473 820 --PHTPGSGIEQNSS 832
Cdd:COG5180  381 gaPFQPPNGAPQPGL 395
PRK10263 PRK10263
DNA translocase FtsK; Provisional
698-824 9.13e-07

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 53.17  E-value: 9.13e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  698 EPTPSPQAYGGTPNPQtpgYPDPSSPQVNP---QYNPQTPGTPAMYNTDQFSPYAAPSP-QGSYQPSPSPQSYHQVAPSP 773
Cdd:PRK10263  370 EPVIAPAPEGYPQQSQ---YAQPAVQYNEPlqqPVQPQQPYYAPAAEQPAQQPYYAPAPeQPAQQPYYAPAPEQPVAGNA 446
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|.
gi 149056473  774 AGYQNTHSPasYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPG 824
Cdd:PRK10263  447 WQAEEQQST--FAPQSTYQTEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPV 495
PHA03378 PHA03378
EBNA-3B; Provisional
654-823 1.09e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 52.76  E-value: 1.09e-06
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 654 TPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEE---YEYAFDDEPTPSPQAYGGTPNPQTPGYPDPS-SPQVNPQ- 728
Cdd:PHA03378 582 TSQLASSAPSYAQTPWPVPHPSQTPEPPTTQSHIPETsapRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHqPPQVEITp 661
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 729 ------------YNPQTPG---------TPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHP 787
Cdd:PHA03378 662 ykptwtqighipYQPSPTGantmlpiqwAPGTMQPPPRAPTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAP 741
                        170       180       190
                 ....*....|....*....|....*....|....*...
gi 149056473 788 TPS--PMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTP 823
Cdd:PHA03378 742 GRArpPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPP 779
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
562-605 2.90e-06

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 44.90  E-value: 2.90e-06
                         10        20        30        40
                 ....*....|....*....|....*....|....*....|....*...
gi 149056473 562 GQTVRISQGPYKGYIGVVKDATEST--ARVELH--STCQTISVDRQRL 605
Cdd:cd00380    1 GDVVRVLRGPYKGREGVVVDIDPRFgiVTVKGAtgSKGAELKVRFDDV 48
PHA03247 PHA03247
large tegument protein UL36; Provisional
666-821 3.07e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.48  E-value: 3.07e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  666 GSRTPAQSGAWDPNNPNTPSRAEEEYeyaFDDEPTPSP-----------------QAYGGTPNPQTPGYPDPSSPQV--N 726
Cdd:PHA03247 2494 AAPDPGGGGPPDPDAPPAPSRLAPAI---LPDEPVGEPvhprmltwirgleelasDDAGDPPPPLPPAAPPAAPDRSvpP 2570
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  727 PQYNPQTPGtPAMyNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQA-SPSPSPVGY 805
Cdd:PHA03247 2571 PRPAPRPSE-PAV-TSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPdPHPPPTVPP 2648
                         170
                  ....*....|....*.
gi 149056473  806 SPMTPGAPSPGGYNPH 821
Cdd:PHA03247 2649 PERPRDDPAPGRVSRP 2664
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
701-829 3.45e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 50.92  E-value: 3.45e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  701 PSPQAYGGTPNPQTPGYPDPSS-PQVNPqynpqTPGTPAMynTDQFSPYAAPSPQGSyQPSPSPQSYHQVAPS--PAGYQ 777
Cdd:pfam03154 172 PVLQAQSGAASPPSPPPPGTTQaATAGP-----TPSAPSV--PPQGSPATSQPPNQT-QSTAAPHTLIQQTPTlhPQRLP 243
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|...
gi 149056473  778 NTHSPASYHPTPSPMAY-QASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQ 829
Cdd:pfam03154 244 SPHPPLQPMTQPPPPSQvSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQ 296
PHA03378 PHA03378
EBNA-3B; Provisional
603-815 3.54e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 51.22  E-value: 3.54e-06
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 603 QRLTTVDSQRPGGMTSTYGRTP---MYGSQTPmygsgsRTPMYGSQTP-LQDGSRTPHYGSQTPLHDGSRTPAQSGAWDP 678
Cdd:PHA03378 575 QPLTSPTTSQLASSAPSYAQTPwpvPHPSQTP------EPPTTQSHIPeTSAPRQWPMPLRPIPMRPLRMQPITFNVLVF 648
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 679 NNPNTPSRAEeeyeyafddeptPSPQAYGGTPNPQTPGYPDPSSPQVN--PQYNPQTPGTPAMYNTDQFSPYAAPS---- 752
Cdd:PHA03378 649 PTPHQPPQVE------------ITPYKPTWTQIGHIPYQPSPTGANTMlpIQWAPGTMQPPPRAPTPMRPPAAPPGraqr 716
                        170       180       190       200       210       220
                 ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 149056473 753 PQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPS-PMAYQASPSPSPVGyspmTPGAPSP 815
Cdd:PHA03378 717 PAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRArPPAAAPGRARPPAA----APGAPTP 776
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
334-378 3.89e-06

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); The KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 44.52  E-value: 3.89e-06
                         10        20        30        40
                 ....*....|....*....|....*....|....*....|....*....
gi 149056473 334 GDHVKVIAGRFEGDTGLIVRVEENFVIL----FSDLTMHELKVLPRDLQ 378
Cdd:cd00380    1 GDVVRVLRGPYKGREGVVVDIDPRFGIVtvkgATGSKGAELKVRFDDVD 49
PHA02682 PHA02682
ORF080 virion core protein; Provisional
660-845 3.91e-06

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 49.47  E-value: 3.91e-06
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 660 QTPLHDGSRTPAQSgawdpnnPNTPSRAEeeyeyafddePTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAM 739
Cdd:PHA02682  79 QSPLAPSPACAAPA-------PACPACAP----------AAPAPAVTCPAPAPACPPATAPTCPPPAVCPAPARPAPACP 141
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 740 YNTDQFSPyAAPSPQgsyqPSPSPqsyhqvAPSPAGYQNTHSPASYhPTPSPMAYQASPSPSPVgyspmtpgapspggYN 819
Cdd:PHA02682 142 PSTRQCPP-APPLPT----PKPAP------AAKPIFLHNQLPPPDY-PAASCPTIETAPAASPV--------------LE 195
                        170       180       190
                 ....*....|....*....|....*....|.
gi 149056473 820 PHTPGSGIEQNSSDWVT-----TDIQVKVRD 845
Cdd:PHA02682 196 PRIPDKIIDADNDDKDLikkelADIADSVRD 226
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
605-816 4.78e-06

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralizing antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 50.69  E-value: 4.78e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  605 LTTVDSQRPGGMTSTYGRTPMYGSQTPM-YGSGSRTPMYGSQTPLQDgSRTPHYGSQTPlhdGSRTPAQSGAWDPNNPNT 683
Cdd:pfam05109 468 VSTADVTSPTPAGTTSGASPVTPSPSPRdNGTESKAPDMTSPTSAVT-TPTPNATSPTP---AVTTPTPNATSPTLGKTS 543
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  684 PSRAeeeYEYAFDDEPTPSPQAYGGTPNPQTPGY-------------PDPSSPQV---NPQYNPQ------TPGTPAMYN 741
Cdd:pfam05109 544 PTSA---VTTPTPNATSPTPAVTTPTPNATIPTLgktsptsavttptPNATSPTVgetSPQANTTnhtlggTSSTPVVTS 620
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  742 TDQFSPYAAPSPQGSYQPSPS------PQSYHQ-VAPSPAGYQNTHSP--ASYHPTPSPMAYQASPSPSPVGYSPMTPGA 812
Cdd:pfam05109 621 PPKNATSAVTTGQHNITSSSTssmslrPSSISEtLSPSTSDNSTSHMPllTSAHPTGGENITQVTPASTSTHHVSTSSPA 700

                  ....
gi 149056473  813 PSPG 816
Cdd:pfam05109 701 PRPG 704
PHA03291 PHA03291
envelope glycoprotein I; Provisional
694-871 5.81e-06

envelope glycoprotein I; Provisional


Pssm-ID: 223033 [Multi-domain]  Cd Length: 401  Bit Score: 49.57  E-value: 5.81e-06
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 694 AFDDEPTPSPQAYGGTPnpqTPGYPDPSSPQVNPQYNPqtpgtpamynTDQFSPyAAPSPQGSYQPSPspqsyhQVAPSP 773
Cdd:PHA03291 165 AFPAEGTLAAPPLGEGS---ADGSCDPALPLSAPRLGP----------ADVFVP-ATPRPTPRTTASP------ETTPTP 224
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 774 AgyqNTHSPASyHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSDWVTTDIQVkvrdtyldTQIV 853
Cdd:PHA03291 225 S---TTTSPPS-TTIPAPSTTIAAPQAGTTPEAEGTPAPPTPGGGEAPPANATPAPEASRYELTVTQI--------IQIA 292
                        170
                 ....*....|....*...
gi 149056473 854 GQTGVIRSVTGGMCSVYL 871
Cdd:PHA03291 293 IPASIIACVFLGSCACCL 310
PHA03247 PHA03247
large tegument protein UL36; Provisional
670-827 9.05e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 49.94  E-value: 9.05e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  670 PAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQV-----NPQYNPQTPGTPAMyNTDQ 744
Cdd:PHA03247 2694 SLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPAtpggpARPARPPTTAGPPA-PAPP 2772
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  745 FSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPAS-YHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTP 823
Cdd:PHA03247 2773 AAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLaPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPL 2852

                  ....
gi 149056473  824 GSGI 827
Cdd:PHA03247 2853 GGSV 2856
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
646-833 9.26e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 49.78  E-value: 9.26e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  646 TPLQDGSRTPHYGSQ-----------TPLHDGSRTPAqSGAWDPNNPNTPSRAEE-EYEYAFDDEPTPSPQAYGGTPNPQ 713
Cdd:PHA03307   26 ATPGDAADDLLSGSQgqlvsdsaelaAVTVVAGAAAC-DRFEPPTGPPPGPGTEApANESRSTPTWSLSTLAPASPAREG 104
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  714 TPGYPDPSSPqvnpqynpqtPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNThSPASYHPTP---- 789
Cdd:PHA03307  105 SPTPPGPSSP----------DPPPPTPPPASPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPA-AVASDAASSrqaa 173
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*.
gi 149056473  790 --SPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNSSD 833
Cdd:PHA03307  174 lpLSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASS 219
PRK10263 PRK10263
DNA translocase FtsK; Provisional
720-831 1.22e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 49.31  E-value: 1.22e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  720 PSSPQVNPQYNPQTpgtpamynTDQFSPYAAPSPQGSYQPSPSPQSYHQ-VAPSPAGYQNTHSPASYHPTPSPMAYQASP 798
Cdd:PRK10263  740 PHEPLFTPIVEPVQ--------QPQQPVAPQQQYQQPQQPVAPQPQYQQpQQPVAPQPQYQQPQQPVAPQPQYQQPQQPV 811
                          90       100       110
                  ....*....|....*....|....*....|...
gi 149056473  799 SPSPVGYSPMTPGAPSPGGYNPHTPGSGIEQNS 831
Cdd:PRK10263  812 APQPQYQQPQQPVAPQPQYQQPQQPVAPQPQDT 844
KOW cd00380
KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known ...
456-493 1.24e-05

KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues.


Pssm-ID: 240504  Cd Length: 49  Bit Score: 43.36  E-value: 1.24e-05
                         10        20        30
                 ....*....|....*....|....*....|....*...
gi 149056473 456 KDIVKVIDGPHSGREGEIRHLYRSFAFLHCKKLVENGG 493
Cdd:cd00380    1 GDVVRVLRGPYKGREGVVVDIDPRFGIVTVKGATGSKG 38
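
Each hit in this list carries a summary line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". The following is a minimal sketch of how such a line could be read into typed fields. It assumes only the layout visible on this page; the names HitSummary and parse_summary are hypothetical and not part of any NCBI tool.

```python
import re
from typing import NamedTuple, Optional


class HitSummary(NamedTuple):
    """Fields of one per-hit summary line (hypothetical container)."""
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


# Layout assumed from the summary lines shown on this page; the
# "[Multi-domain]" tag is optional (it is absent for cd00380).
_SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm>\d+)\s*(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<length>\d+)\s*"
    r"Bit Score:\s*(?P<score>[\d.]+)\s*"
    r"E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


def parse_summary(line: str) -> Optional[HitSummary]:
    """Return the parsed fields, or None if the line is not a summary line."""
    m = _SUMMARY_RE.search(line)
    if m is None:
        return None
    return HitSummary(
        pssm_id=int(m.group("pssm")),
        multi_domain=m.group("multi") is not None,
        cd_length=int(m.group("length")),
        bit_score=float(m.group("score")),
        e_value=float(m.group("evalue")),
    )


if __name__ == "__main__":
    line = "Pssm-ID: 240504  Cd Length: 49  Bit Score: 43.36  E-value: 1.24e-05"
    print(parse_summary(line))
```

Returning None for non-matching lines lets the same function be run over every line of the page, keeping only the summary records.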
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
698-821 2.57e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.06  E-value: 2.57e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 698 EPTPSPQAYGGTPNPQTPGyPDPSSPQVNPQYNPQTPGTPAmyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQ 777
Cdd:PRK07764 391 AGAPAAAAPSAAAAAPAAA-PAPAAAAPAAAAAPAPAAAPQ--------PAPAPAPAPAPPSPAGNAPAGGAPSPPPAAA 461
                         90       100       110       120
                 ....*....|....*....|....*....|....*....|....
gi 149056473 778 NTHSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPH 821
Cdd:PRK07764 462 PSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAG 505
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
671-823 2.94e-05

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 45.41  E-value: 2.94e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  671 AQSGAWDPNNPNTPSR-AEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQfsPYA 749
Cdd:pfam15240  16 AQSSSEDVSQEDSPSLiSEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPPP--QGG 93
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 149056473  750 APSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPvGYSPMTPGAPSPGGYNPHTP 823
Cdd:pfam15240  94 PRPPPGKPQGPPPQGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQQGPPPPPP-GNPQGPPQRPPQPGNPQGPP 166
PHA03247 PHA03247
large tegument protein UL36; Provisional
630-823 3.16e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.01  E-value: 3.16e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  630 TPMYGSGSRTPmyGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGT 709
Cdd:PHA03247 2741 PPAVPAGPATP--GGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAAL 2818
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  710 PNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFS-----PYAAPSPQGSYQPSPSPQSYHQVA--PSPAGYQNTHS- 781
Cdd:PHA03247 2819 PPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSvapggDVRRRPPSRSPAAKPAAPARPPVRrlARPAVSRSTESf 2898
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*
gi 149056473  782 ---PASYHPTPSPMAyQASPSPSPVGYSPMTPGAPSPGGYNPHTP 823
Cdd:PHA03247 2899 alpPDQPERPPQPQA-PPPPQPQPQPPPPPQPQPPPPPPPRPQPP 2942
SP7_N cd22542
N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins ...
657-825 3.74e-05

N-terminal domain of transcription factor Specificity Protein (SP) 7; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP7, also called Osterix (Osx) in humans, is highly conserved among bone-forming vertebrates. It plays a major role, along with Runx2 and Dlx5, in driving the differentiation of mesenchymal precursor cells into osteoblasts and eventually osteocytes. SP7 also plays a regulatory role by inhibiting chondrocyte differentiation, maintaining the balance between differentiation of mesenchymal precursor cells into ossified bone or cartilage. Mutations of this gene have been associated with multiple dysfunctional bone phenotypes in vertebrates. SP7 is thought to play a role in diseases such as Osteogenesis imperfecta. SP7 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP7.
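
The consensus notation in the description above (N is any nucleotide, R is A/G, Y is C/T) can be expanded mechanically into a regular expression. The sketch below is illustrative only: the function names are hypothetical, and it performs a strict position-by-position match, whereas the description calls the consensus loose, so real binding sites need not pass this strict test.

```python
import re

# Expansion of the consensus letters, following the definitions given in
# the description (N = any nucleotide, R = A/G, Y = C/T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus string such as NNRCRCCYY into a regex pattern."""
    return "".join(IUPAC[base] for base in consensus.upper())


def matches_consensus(site: str, consensus: str = "NNRCRCCYY") -> bool:
    """Strict test: every position of `site` must satisfy the consensus."""
    return re.fullmatch(consensus_to_regex(consensus), site.upper()) is not None


if __name__ == "__main__":
    print(consensus_to_regex("NNRCRCCYY"))
    print(matches_consensus("AAGCGCCCT"))
```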


Pssm-ID: 411691 [Multi-domain]  Cd Length: 297  Bit Score: 46.82  E-value: 3.74e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 657 YGSQTPLHDgSRTPAQSGAWDPNNPNTP--------SRAEE----EYEYAFDD-----EPTPSPQA---YGGTPNPQTPG 716
Cdd:cd22542   26 FGGSSPIRD-SATPGKPGNNPGKKPYSLgsdlssakSRSSElmgdSYTATFSSgnglmSPSGSPQAsttYGNDYNPFSHS 104
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 717 YPDPSSPQ----VNPQYNPQTPGTPAMYNT-DQFSPY-----AAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYH 786
Cdd:cd22542  105 FPTSSGSQdpslLVSKGHPSADCLPSVYTSlDMAHPYgswykTGIHPGISSSSTNATASWWDMHSNTNWLSAQGQPDGLQ 184
                        170       180       190
                 ....*....|....*....|....*....|....*....
gi 149056473 787 PTPSPMAYQASPSPSPVGYSPMTPgaPSPGGYNPHTPGS 825
Cdd:cd22542  185 ASLQPVPAQTPLNPQLPSYTEFTT--LNPAPYPAVGISS 221
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
596-815 4.68e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 47.45  E-value: 4.68e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  596 QTISVDRQRLTTVDSQRPGGMTS-TYGRTPMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSG 674
Cdd:pfam03154  24 QTASPDGRASPTNEDLRSSGRNSpSAASTSSNDSKAESMKKSSKKIKEEAPSPLKSAKRQREKGASDTEEPERATAKKSK 103
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  675 AWDPNNPNTPSRAEEEYE--YAFDDEPTPSPQAYGGTPNPQTPGYPDP--------SS----------PQVNPQYNPQTP 734
Cdd:pfam03154 104 TQEISRPNSPSEGEGESSdgRSVNDEGSSDPKDIDQDNRSTSPSIPSPqdnesdsdSSaqqqilqtqpPVLQAQSGAASP 183
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  735 GTPAMYNTDQfSPYAAPSPQGsyqPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPvGYSPMTPGAPS 814
Cdd:pfam03154 184 PSPPPPGTTQ-AATAGPTPSA---PSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSPHP-PLQPMTQPPPP 258

                  .
gi 149056473  815 P 815
Cdd:pfam03154 259 S 259
KOW smart00739
KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.
330-357 4.74e-05

KOW (Kyrpides, Ouzounis, Woese) motif; Motif in ribosomal proteins, NusG, Spt5p, KIN17 and T54.


Pssm-ID: 128978  Cd Length: 28  Bit Score: 41.16  E-value: 4.74e-05
                           10        20
                   ....*....|....*....|....*...
gi 149056473   330 YFKMGDHVKVIAGRFEGDTGLIVRVEEN 357
Cdd:smart00739   1 KFEVGDTVRVIAGPFKGKVGKVLEVDGE 28
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
699-813 6.40e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 46.60  E-value: 6.40e-05
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 699 PTPSPQAYGGTPNPQTPGyPDPSSPQVNPQYNPQTPGTPAmyntdqfsPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 778
Cdd:PRK14959 387 EGPASGGAATIPTPGTQG-PQGTAPAAGMTPSSAAPATPA--------PSAAPSPRVPWDDAPPAPPRSGIPPRPAPRMP 457
                         90       100       110
                 ....*....|....*....|....*....|....*.
gi 149056473 779 THSPASYHPTPSPMAYQASPSPS-PVGYSPMTPGAP 813
Cdd:PRK14959 458 EASPVPGAPDSVASASDAPPTLGdPSDTAEHTPSGP 493
PHA03247 PHA03247
large tegument protein UL36; Provisional
638-817 8.62e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.86  E-value: 8.62e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  638 RTPMYGSQTPLQDGSRTPHYGSQTPLhdgsRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPS--------------P 703
Cdd:PHA03247 2599 RAPVDDRGDPRGPAPPSPLPPDTHAP----DPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGrvsrprrarrlgraA 2674
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  704 QAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAP-SPQGSYQPSPSPQSYHQVAPSPAGyqnTHSP 782
Cdd:PHA03247 2675 QASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPpGPAAARQASPALPAAPAPPAVPAG---PATP 2751
                         170       180       190
                  ....*....|....*....|....*....|....*..
gi 149056473  783 ASYHPTPSPMAYQASPSPSPVGYSPMTP--GAPSPGG 817
Cdd:PHA03247 2752 GGPARPARPPTTAGPPAPAPPAAPAAGPprRLTRPAV 2788
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
561-592 8.85e-05

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 40.45  E-value: 8.85e-05
                          10        20        30
                  ....*....|....*....|....*....|..
gi 149056473  561 IGQTVRISQGPYKGYIGVVKDATESTARVELH 592
Cdd:pfam00467   1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
650-838 1.17e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.13  E-value: 1.17e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 650 DGSRTPHYGSQTPLHDGSR--TPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNP 727
Cdd:PRK07764 596 GGEGPPAPASSGPPEEAARpaAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGG 675
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 728 QYNPQTPGTPAMyntdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSP 807
Cdd:PRK07764 676 AAPAAPPPAPAP------AAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPP 749
                        170       180       190
                 ....*....|....*....|....*....|.
gi 149056473 808 MTPGAPSPGGYNPHTPGSGIEQNSSDWVTTD 838
Cdd:PRK07764 750 DPAGAPAQPPPPPAPAPAAAPAAAPPPSPPS 780
PHA03247 PHA03247
large tegument protein UL36; Provisional
637-815 1.18e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.47  E-value: 1.18e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  637 SRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEyeyAFDDEPTPsPQAYGGTPNPQTPG 716
Cdd:PHA03247 2710 PAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTT---AGPPAPAP-PAAPAAGPPRRLTR 2785
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  717 YP----DPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPqgsyqPSPSPQSYHQVAPSPA-----------GYQNTHS 781
Cdd:PHA03247 2786 PAvaslSESRESLPSPWDPADPPAAVLAPAAALPPAASPAG-----PLPPPTSAQPTAPPPPpgppppslplgGSVAPGG 2860
                         170       180       190
                  ....*....|....*....|....*....|....*
gi 149056473  782 PASYHPTP-SPMAYQASPSPSPVGYSPMTPGAPSP 815
Cdd:PHA03247 2861 DVRRRPPSrSPAAKPAAPARPPVRRLARPAVSRST 2895
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
665-825 1.34e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 45.93  E-value: 1.34e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  665 DGSRTPAQSGAWDPNNPNTPSRAeeeyeyAFDDEPTPSPQAYGGTPNPQTPGYPDPSSP--QVNPQYNPQTPGTPAMYNT 742
Cdd:PHA03307  238 DSSSSESSGCGWGPENECPLPRP------APITLPTRIWEASGWNGPSSRPGPASSSSSprERSPSPSPSSPGSGPAPSS 311
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  743 DQFSPYAAPSPQGSyQPSPSPQSyhqVAPSPAGyqnTHSPASYHPTPSPmayqASPSPSPVGYSPMTPGAPSPGGYNPHT 822
Cdd:PHA03307  312 PRASSSSSSSRESS-SSSTSSSS---ESSRGAA---VSPGPSPSRSPSP----SRPPPPADPSSPRKRPRPSRAPSSPAA 380

                  ...
gi 149056473  823 PGS 825
Cdd:PHA03307  381 SAG 383
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
700-815 1.40e-04

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 45.68  E-value: 1.40e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  700 TPSPQAYGGTPNPQTPGY--PD-----PSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPS 772
Cdd:pfam05109 422 SKAPESTTTSPTLNTTGFaaPNtttglPSSTHVPTNLTAPASTGPTVSTADVTSPTPAGTTSGASPVTPSPSPRDNGTES 501
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|...
gi 149056473  773 PAgyQNTHSPASYHPTPSPMAyqasPSPSPVGYSPmTPGAPSP 815
Cdd:pfam05109 502 KA--PDMTSPTSAVTTPTPNA----TSPTPAVTTP-TPNATSP 537
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
710-896 1.59e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 45.57  E-value: 1.59e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 710 PNPQTPGYPDPSSPQVNPQYNPQTPGTPAMyntdqfspyAAPSPQGSYQPSPSPQSyhQVAPSPAGYQNTHSPASYHPTP 789
Cdd:PRK14950 364 PAPQPAKPTAAAPSPVRPTPAPSTRPKAAA---------AANIPPKEPVRETATPP--PVPPRPVAPPVPHTPESAPKLT 432
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 790 spmayqasPSPSPVGYSPMTPGAPSPGGYNP--HTPGSGIEQNSSDWVTTDIQVKVRDTYLdtQIVGQTGViRSVTggmc 867
Cdd:PRK14950 433 --------RAAIPVDEKPKYTPPAPPKEEEKalIADGDVLEQLEAIWKQILRDVPPRSPAV--QALLSSGV-RPVS---- 497
                        170       180       190
                 ....*....|....*....|....*....|
gi 149056473 868 svyLKDSEKVVSISSE-HLEPITPTKNNKV 896
Cdd:PRK14950 498 ---VEKNTLTLSFKSKfHKDKIEEPENRKI 524
Pneumo_att_G pfam05539
Pneumovirinae attachment membrane glycoprotein G;
607-789 1.65e-04

Pneumovirinae attachment membrane glycoprotein G;


Pssm-ID: 114270 [Multi-domain]  Cd Length: 408  Bit Score: 45.04  E-value: 1.65e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  607 TVDSQRPGGMTSTYGRTPMYGSQTPMYGSGSRTPMYGSQT----PLQDG-SRTPHYGSQTPLHDgSRTPAQSGAWDP--N 679
Cdd:pfam05539 194 TPQSQPATQGHQTATANQRLSSTEPVGTQGTTTSSNPEPQteppPSQRGpSGSPQHPPSTTSQD-QSTTGDGQEHTQrrK 272
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  680 NPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPyaaPSPQ----- 754
Cdd:pfam05539 273 TPPATSNRRSPHSTATPPPTTKRQETGRPTPRPTATTQSGSSPPHSSPPGVQANPTTQNLVDCKELDP---PKPNsicyg 349
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 149056473  755 -GSYQPSpSPQSYHQVAPSPAGYqNTHSPASYHPTP 789
Cdd:pfam05539 350 vGIYNEA-LPRGCDIVVPLCSTY-TIMCMDTYYSKP 383
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
670-825 1.71e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 45.36  E-value: 1.71e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 670 PAQSGAWDPNNPNTPSRAEEEYEYAfdDEPTPS---PQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFS 746
Cdd:PRK07764 592 PGAAGGEGPPAPASSGPPEEAARPA--APAAPAapaAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGW 669
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 747 PYAAPSPQGSyQPSPSPQSYHQVAPS--PAGYQNTHSPASYHP---------TPSPMAYQASPSPSPVGYSPMTPGAPSP 815
Cdd:PRK07764 670 PAKAGGAAPA-APPPAPAPAAPAAPAgaAPAQPAPAPAATPPAgqaddpaaqPPQAAQGASAPSPAADDPVPLPPEPDDP 748
                        170
                 ....*....|
gi 149056473 816 GGYNPHTPGS 825
Cdd:PRK07764 749 PDPAGAPAQP 758
dnaA PRK14086
chromosomal replication initiator protein DnaA;
640-804 1.82e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 45.20  E-value: 1.82e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 640 PMY-GSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNnPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYP 718
Cdd:PRK14086 103 RRTsEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTAR-PAYPAYQQRPEPGAWPRAADDYGWQQQRLGFPPRAPYA 181
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 719 DPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSY-------QPSPSPQSYHQV--APSPAGYQNTHSPASYHPTP 789
Cdd:PRK14086 182 SPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRprrdrtdRPEPPPGAGHVHrgGPGPPERDDAPVVPIRPSAP 261
                        170
                 ....*....|....*
gi 149056473 790 SPMAYQASPSPSPVG 804
Cdd:PRK14086 262 GPLAAQPAPAPGPGE 276
PHA03247 PHA03247
large tegument protein UL36; Provisional
681-823 1.92e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 45.70  E-value: 1.92e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  681 PNTPS-RAEEEYEYAFDDEPTPSPQAyGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMY----NTDQFSPYAAPSPQG 755
Cdd:PHA03247 2475 PGAPVyRRPAEARFPFAAGAAPDPGG-GGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMLtwirGLEELASDDAGDPPP 2553
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  756 SYQPSPSPQSYHQVAPSPagyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAP--SPGGYNPHTP 823
Cdd:PHA03247 2554 PLPPAAPPAAPDRSVPPP-------RPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDrgDPRGPAPPSP 2616
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
679-815 2.04e-04

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 45.14  E-value: 2.04e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 679 NNPNTPSRAEEEYEYAFDD-EPTPSPQAYGGTPN-PQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQG- 755
Cdd:NF033839 249 DNVNTKVEIENTVHKIFADmDAVVTKFKKGLTQDtPKEPGNKKPSAPKPGMQPSPQPEKKEVKPEPETPKPEVKPQLEKp 328
                         90       100       110       120       130       140
                 ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 149056473 756 SYQPSPSPQSYH-QVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSpvgySPMTPGAPSP 815
Cdd:NF033839 329 KPEVKPQPEKPKpEVKPQLETPKPEVKPQPEKPKPEVKPQPEKPKPE----VKPQPETPKP 385
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
334-362 2.06e-04

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 39.29  E-value: 2.06e-04
                          10        20        30
                  ....*....|....*....|....*....|.
gi 149056473  334 GDHVKVIAGRFEGDTGLIVRVEE--NFVILF 362
Cdd:pfam00467   2 GDVVRVIAGPFKGKVGKVVEVDDkkNRVLVE 32
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
651-823 2.59e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 45.16  E-value: 2.59e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  651 GSRTPHYGSQTPLHDGSRTPAQSGAWDPnnPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQvnPQYN 730
Cdd:PHA03307  773 ALLEPAEPQRGAGSSPPVRAEAAFRRPG--RLRRSGPAADAASRTASKRKSRSHTPDGGSESSGPARPPGAAAR--PPPA 848
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  731 PQTPGTPAMyntDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASyHPTPSPMAyqaspsPSPVGYSPMTP 810
Cdd:PHA03307  849 RSSESSKSK---PAAAGGRARGKNGRRRPRPPEPRARPGAAAPPKAAAAAPPAG-APAPRPRP------APRVKLGPMPP 918
                         170       180
                  ....*....|....*....|
gi 149056473  811 GAPSP-GGY------NPHTP 823
Cdd:PHA03307  919 GGPDPrGGFrrvppgDLHTP 938
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
671-815 3.06e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.76  E-value: 3.06e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  671 AQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGT---PNPQTPGYPDPSSPQVNPQYNPQTpgTPAMYNTDQFSP 747
Cdd:pfam03154 176 AQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATsqpPNQTQSTAAPHTLIQQTPTLHPQR--LPSPHPPLQPMT 253
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  748 YAAPSPQGSYQPSPSPQSY------------------HQVAPSPAGYQNTHSPASYHPTPSPMAyqASPSPSPVGYSPMT 809
Cdd:pfam03154 254 QPPPPSQVSPQPLPQPSLHgqmppmphslqtgpshmqHPVPPQPFPLTPQSSQSQVPPGPSPAA--PGQSQQRIHTPPSQ 331

                  ....*.
gi 149056473  810 PGAPSP 815
Cdd:pfam03154 332 SQLQSQ 337
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
701-832 3.72e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.48  E-value: 3.72e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 701 PSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTH 780
Cdd:PRK12323 392 PAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRP 471
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|....*
gi 149056473 781 SPASYHPTPSPMAYQASPSPSPVGYSP---MTPGAPSPGGYNPHTPGSGIEQNSS 832
Cdd:PRK12323 472 VAAAAAAAPARAAPAAAPAPADDDPPPweeLPPEFASPAPAQPDAAPAGWVAESI 526
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
647-816 3.72e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.39  E-value: 3.72e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  647 PLQDGSRTPHYGSQT--PLHDGSRTPAQSGAWDPNNP-------NTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGY 717
Cdd:PHA03307  195 PSTPPAAASPRPPRRssPISASASSPAPAPGRSAADDagasssdSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASG 274
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  718 PDPSSPQVNPQYNPQTPGtpamyntdqfSPYAAPSPQ---GSYQPSPSPQSYHQVAPSPAGyqnTHSPASYHPTPSPMAY 794
Cdd:PHA03307  275 WNGPSSRPGPASSSSSPR----------ERSPSPSPSspgSGPAPSSPRASSSSSSSRESS---SSSTSSSSESSRGAAV 341
                         170       180
                  ....*....|....*....|..
gi 149056473  795 QASPSPSPVGYSPMTPGAPSPG 816
Cdd:PHA03307  342 SPGPSPSRSPSPSRPPPPADPS 363
PTZ00395 PTZ00395
Sec24-related protein; Provisional
643-821 3.93e-04

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 44.68  E-value: 3.93e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  643 GSQTPLQDGSRTPHYGSQTPL-HDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTP--NP--QTPGY 717
Cdd:PTZ00395  345 GSPNAASAGAPFNGLGNQADGgHINQVHPDARGAWAGGPHSNASYNCAAYSNAAQSNAAQSNAGFSNAGysNPgnSNPGY 424
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  718 PDP---SSPQVNPQY------NPQTPGTPamYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPT 788
Cdd:PTZ00395  425 NNApnsNTPYNNPPNsntpysNPPNSNPP--YSNLPYSNTPYSNAPLSNAPPSSAKDHHSAYHAAYQHRAANQPAANLPT 502
                         170       180       190
                  ....*....|....*....|....*....|...
gi 149056473  789 PSPMAyqASPSPSPVGYSPMTPGAPSPGGYNPH 821
Cdd:PTZ00395  503 ANQPA--ANNFHGAAGNSVGNPFASRPFGSAPY 533
dnaA PRK14086
chromosomal replication initiator protein DnaA;
698-824 4.05e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 44.05  E-value: 4.05e-04
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 698 EPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQsyhqvAPSPAGYQ 777
Cdd:PRK14086  94 EPAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQLPTARPAYPAYQQRPEPGAWPR-----AADDYGWQ 168
                         90       100       110       120
                 ....*....|....*....|....*....|....*....|....*...
gi 149056473 778 NT-HSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGyNPHTPG 824
Cdd:PRK14086 169 QQrLGFPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRR-DYDHPR 215
DUF1373 pfam07117
Protein of unknown function (DUF1373); This family consists of several hypothetical proteins ...
681-795 5.51e-04

Protein of unknown function (DUF1373); This family consists of several hypothetical proteins which seem to be specific to Oryzias latipes (Japanese ricefish). Members of this family are typically around 200 residues in length. The function of this family is unknown.


Pssm-ID: 462093 [Multi-domain]  Cd Length: 212  Bit Score: 42.47  E-value: 5.51e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  681 PNTPSRAEEEYEY----AFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGS 756
Cdd:pfam07117  42 PPRPEEEEGQGGGggtfPFPGSPEPEPGGGGSGPMPMSASAPEPEPAKAKPQRPAPAQGHGHGGGGDSDSSGSGSGHQGS 121
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|..
gi 149056473  757 YQP---SPSPQSYHQVAPSPAGYQNTHSPasyHPTPSPMAYQ 795
Cdd:pfam07117 122 GGAgagAGAPGHQHEQEQESSSSDDDDED---EFEFTPEEDE 160
PRK10263 PRK10263
DNA translocase FtsK; Provisional
640-805 5.66e-04

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 43.92  E-value: 5.66e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  640 PMYGSQTPLQDGSR--TPHYGSQTPlhDGSRTPAQSGaWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGY 717
Cdd:PRK10263  345 PVASVDVPPAQPTVawQPVPGPQTG--EPVIAPAPEG-YPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYY 421
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  718 -PDPSSPQVNPQYNPQtPGTPAMYNtdqfsPYAAPSPQGSYQPSPSPQSYHQ-VAPSPAGYQNTHSPASYHPT---PSPM 792
Cdd:PRK10263  422 aPAPEQPAQQPYYAPA-PEQPVAGN-----AWQAEEQQSTFAPQSTYQTEQTyQQPAAQEPLYQQPQPVEQQPvvePEPV 495
                         170
                  ....*....|...
gi 149056473  793 AYQASPSPSPVGY 805
Cdd:PRK10263  496 VEETKPARPPLYY 508
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
622-815 6.03e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.01  E-value: 6.03e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  622 RTPMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPhygsQTPLHDGSRTPAQSGAwDPNNPNTPSRAEeeyeyafdDEPTP 701
Cdd:PHA03307   64 RFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAP----ASPAREGSPTPPGPSS-PDPPPPTPPPAS--------PPPSP 130
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  702 SPQAYGGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYNTdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqntHS 781
Cdd:PHA03307  131 APDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQ---AALPLSSPEETARAPSSPPAEPPPSTPPAA----AS 203
                         170       180       190
                  ....*....|....*....|....*....|....
gi 149056473  782 PASyhPTPSPMAyqASPSPSPVGYSPMTPGAPSP 815
Cdd:PHA03307  204 PRP--PRRSSPI--SASASSPAPAPGRSAADDAG 233
Pneumo_att_G pfam05539
Pneumovirinae attachment membrane glycoprotein G;
667-870 9.45e-04

Pneumovirinae attachment membrane glycoprotein G;


Pssm-ID: 114270 [Multi-domain]  Cd Length: 408  Bit Score: 42.73  E-value: 9.45e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  667 SRTPAQSGAWDPN--NPNTPS-RAEEEYEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQvnPQYNPqtpgtPAMYNTD 743
Cdd:pfam05539 171 AVTTSKTTSWPTEvsHPTYPSqVTPQSQPATQGHQTATANQRLSSTEPVGTQGTTTSSNPE--PQTEP-----PPSQRGP 243
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  744 QFSPYAAPS-PQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASyHPTPSPM-----AYQASPSPSPVGYSPMTPGAPSPGG 817
Cdd:pfam05539 244 SGSPQHPPStTSQDQSTTGDGQEHTQRRKTPPATSNRRSPHS-TATPPPTtkrqeTGRPTPRPTATTQSGSSPPHSSPPG 322
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 149056473  818 --YNPHTpgsgieQNSSDWVTTDIQVKVRDTY-LDTQIVGQTGVIRSVTgGMCSVY 870
Cdd:pfam05539 323 vqANPTT------QNLVDCKELDPPKPNSICYgVGIYNEALPRGCDIVV-PLCSTY 371
Med26_M pfam15694
Mediator complex subunit 26 middle domain; Med26_M is the middle domain of subunit 26 of ...
713-829 1.01e-03

Mediator complex subunit 26 middle domain; Med26_M is the middle domain of subunit 26 of Mediator. Med19 and Med26 act synergistically to mediate the interaction between REST (a Kruppel-type zinc finger transcription factor that binds to a 21-bp RE1 silencing element present in over 900 human genes) and Mediator.


Pssm-ID: 464807 [Multi-domain]  Cd Length: 255  Bit Score: 41.78  E-value: 1.01e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473  713 QTPGYPDPSSPQvNPQYNPQTpgTPAMYNTDQFSPYAapsPQGSY-QPSPSPQSYHQVAPSPAGYQNTHSP--------A 783
Cdd:pfam15694  81 ETGGPPQPKSPR-CSSFSPRN--SRHETFARRSSTYA---PKGSVpSPSPRSQVLDAQVPSPLPLSQPSTPpvqakrleK 154
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|..
gi 149056473  784 SYHPTP-SPMAYQASPS-----PSPVGYSPMTPGAPSPGGYNPHTPGSGIEQ 829
Cdd:pfam15694 155 PPQSSPeSSQHWLEQSDseshqRHQDGSATLLSQSVSPGCKTPLHPGENSLP 206
KLF1_2_4_N cd21972
N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel ...
684-823 1.10e-03

N-terminal domain of Kruppel-like factor (KLF) 1, KLF2, KLF4, and similar proteins; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the related N-terminal domains of KLF1, KLF2, KLF4, and similar proteins.
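The degenerate consensus quoted above (NNRCRCCYY, with N = any nucleotide, R = A/G, Y = C/T) can be read as a simple pattern. The following is a minimal sketch, not part of the CDD record, that expands those three ambiguity codes into a regular expression and scans one strand of a DNA string. The function names and the test sequences are hypothetical, and the real binding preference is described only as "loose", so an exact-match scan like this is an illustration rather than a binding-site predictor.

```python
import re

# Ambiguity codes as defined in the description: N = any base, R = A/G, Y = C/T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]",
}


def motif_to_regex(motif):
    """Translate a motif written with the codes above into a regex string."""
    return "".join(IUPAC[base] for base in motif)


def find_sites(sequence, motif="NNRCRCCYY"):
    """Return 0-based start positions of all matches on the given strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequence containing one site that fits the consensus.
    print(find_sites("ttCCACACCCTgg"))
```

Only the strand passed in is scanned; checking the opposite strand would require searching the reverse complement as well.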


Pssm-ID: 409230 [Multi-domain]  Cd Length: 194  Bit Score: 41.12  E-value: 1.10e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 684 PSRAEEEYEYAFDDEPTPSPQAYG----GTPNPQTPGYPDPSSPQVNPQYN-PQTPGTPAMYNTDQFSPYAAPSPQGSYQ 758
Cdd:cd21972   22 LDLEFILSNTVTSDNDNPPPPDPAypppESPESCSTVYDSDGCHPTPNAYCgPNGPGLPGHFLLAGNSPNLGPKIKTENQ 101
                         90       100       110       120       130       140       150
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149056473 759 PS-------PSPQSYHQVAPS------PAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPmtPGAPSPGGYNPHTP 823
Cdd:cd21972  102 EQacmpvagYSGHYGPREPQRvppappPPQYAGHFQYHGHFNMFSPPLRANHPGMSTVMLTP--LSTPPLGFLSPEEA 177
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
707-826 1.17e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 42.67  E-value: 1.17e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 707 GGTPNPQTPGYPDPSSPQVnPQYNPQTPGTPAMyntdqfspyAAPSPQGSYQPSPSPQSYHQVAPSPAgyqnthSPASYH 786
Cdd:PRK07764 389 GGAGAPAAAAPSAAAAAPA-AAPAPAAAAPAAA---------AAPAPAAAPQPAPAPAPAPAPPSPAG------NAPAGG 452
                         90       100       110       120
                 ....*....|....*....|....*....|....*....|
gi 149056473 787 PTPSPMAYQASPSPSPVGYSPMTPGAPSPGGYNPHTPGSG 826
Cdd:PRK07764 453 APSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAA 492
PHA03369 PHA03369
capsid maturational protease; Provisional
669-766 1.23e-03

capsid maturational protease; Provisional


Pssm-ID: 223061 [Multi-domain]  Cd Length: 663  Bit Score: 42.68  E-value: 1.23e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 669 TPAQSGAWDPNnPNTPSRAEEE-YEYAFDDEPTPSPQAYGGTPNPQTPGYPDPSSPQVnpqynPQTPGTPAMYNTDQFSP 747
Cdd:PHA03369 353 LTAPSRVLAAA-AKVAVIAAPQtHTGPADRQRPQRPDGIPYSVPARSPMTAYPPVPQF-----CGDPGLVSPYNPQSPGT 426
                         90
                 ....*....|....*....
gi 149056473 748 YAAPSPQGSYQPSPSPQSY 766
Cdd:PHA03369 427 SYGPEPVGPVPPQPTNPYV 445
dnaA PRK14086
chromosomal replication initiator protein DnaA;
678-826 1.47e-03

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 42.51  E-value: 1.47e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 678 PNNPNTPSRAEEeyeyafddeptPSPQAYGGTPNPQTPGYPDPSSPQVNPQYnPQTPGTPAMY--NTDQFSPYAAPSPQG 755
Cdd:PRK14086  96 APPPPHARRTSE-----------PELPRPGRRPYEGYGGPRADDRPPGLPRQ-DQLPTARPAYpaYQQRPEPGAWPRAAD 163
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 756 SYQPSPSPQSYhqvaPSPAGYQnthSPASYHPTPSPMAY----------QASPSPSPVGYSPMTPGA-------PSPGGY 818
Cdd:PRK14086 164 DYGWQQQRLGF----PPRAPYA---SPASYAPEQERDREpydagrpeydQRRRDYDHPRPDWDRPRRdrtdrpePPPGAG 236

                 ....*...
gi 149056473 819 NPHTPGSG 826
Cdd:PRK14086 237 HVHRGGPG 244
PHA03325 PHA03325
nuclear-egress-membrane-like protein; Provisional
659-820 1.53e-03

nuclear-egress-membrane-like protein; Provisional


Pssm-ID: 223044  Cd Length: 418  Bit Score: 42.18  E-value: 1.53e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 659 SQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEYAFDDEPTPSPQAyggtpnpqtpgYPDPSSPQVNPQYNPQTPGTPA 738
Cdd:PHA03325 266 SSLPTSAPKRRSRRAGAMRAAAGETADLADDDGSEHSDPEPLPASLP-----------PPPVRRPRVKHPEAGKEEPDGA 334
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 739 MYNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGyqnthSPASYHPTPSPMAYQASPSPSPVGYSPMTPGAPSPGGY 818
Cdd:PHA03325 335 RNAEAKEPAQPATSTSSKGSSSAQNKDSGSTGPGSSL-----AAASSFLEDDDFGSPPLDLTTSLRHMPSPSVTSAPEPP 409

                 ..
gi 149056473 819 NP 820
Cdd:PHA03325 410 SI 411
PHA03264 PHA03264
envelope glycoprotein D; Provisional
638-827 1.81e-03

envelope glycoprotein D; Provisional


Pssm-ID: 223029 [Multi-domain]  Cd Length: 416  Bit Score: 41.91  E-value: 1.81e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 638 RTPMYGSQTPLQDGSR---------TPHYgsQTPLHDgsrtpAQSGAWDPNNPNTPSRAeeeYEYAFDDEPTPSPQayGG 708
Cdd:PHA03264 206 RGYTFGACFPDEDYEQrkvlrltylTQYY--PQEAHK-----AIVDYWFMRHGGVVPPY---FEESKGYEPPPAPS--GG 273
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 709 TPNPqtPGYPDPsspQVNPQYNPQTPGTPAMYNTDQFSPYAAPSPQGSYQPSPSPQSyhqvapspagyqnthspasyhPT 788
Cdd:PHA03264 274 SPAP--PGDDRP---EAKPEPGPVEDGAPGRETGGEGEGPEPAGRDGAAGGEPKPGP---------------------PR 327
                        170       180       190       200
                 ....*....|....*....|....*....|....*....|.
gi 149056473 789 PSPMAYQAS--PSPSPVGYSPMTPGAPSPGGYNPHTPGSGI 827
Cdd:PHA03264 328 PAPDADRPEgwPSLEAITFPPPTPATPAVPRARPVIVGTGI 368
PHA03369 PHA03369
capsid maturational protease; Provisional
709-846 1.91e-03

capsid maturational protease; Provisional


Pssm-ID: 223061 [Multi-domain]  Cd Length: 663  Bit Score: 41.91  E-value: 1.91e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 709 TPNPQTPGYPDPSSPQVNPqynPQTPGTPAM---YNTDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSpAGYQNTHSPASY 785
Cdd:PHA03369 353 LTAPSRVLAAAAKVAVIAA---PQTHTGPADrqrPQRPDGIPYSVPARSPMTAYPPVPQFCGDPGLV-SPYNPQSPGTSY 428
                         90       100       110       120       130       140
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 149056473 786 HPTPSPMAYQASPSPS--PVGYSPMT-PGAPSPGGYnpHTPGS-GIEQNSSDWVTTDIQVKVRDT 846
Cdd:PHA03369 429 GPEPVGPVPPQPTNPYvmPISMANMVyPGHPQEHGH--ERKRKrGGELKEELIETLKLVKKLKEE 491
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
746-821 1.94e-03

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 40.79  E-value: 1.94e-03
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149056473 746 SPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHsPASYHP--TPSPMAYQASPSPSPVGYSPMTpgAPSPGGYNPH 821
Cdd:cd21577   41 SSSSSSSSPSSRASPPSPYSKSSPPSPPQQRPLSP-PLSLPPpvAPPPLSPGSVPGGLPVISPVMV--QPVPVLYPPH 115
PHA03269 PHA03269
envelope glycoprotein C; Provisional
736-837 2.02e-03

envelope glycoprotein C; Provisional


Pssm-ID: 165527 [Multi-domain]  Cd Length: 566  Bit Score: 42.02  E-value: 2.02e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 736 TPAMYN---TDQFSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQNTHSPASYHPTPSPMAYQASPSPSPVGYSPMTPGA 812
Cdd:PHA03269  28 IPELHTsaaTQKPDPAPAPHQAASRAPDPAVAPTSAASRKPDLAQAPTPAASEKFDPAPAPHQAASRAPDPAVAPQLAAA 107
                         90       100
                 ....*....|....*....|....*
gi 149056473 813 PSPggyNPHTPGSGIEQNSSDWVTT 837
Cdd:PHA03269 108 PKP---DAAEAFTSAAQAHEAPADA 129
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
699-816 2.26e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.90  E-value: 2.26e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 699 PTPSPQAYGGTPNPQTPgypDPSSPQVNPQYNPQTPGTPAmyntdqfSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 778
Cdd:PRK07764 404 AAPAAAPAPAAAAPAAA---AAPAPAAAPQPAPAPAPAPA-------PPSPAGNAPAGGAPSPPPAAAPSAQPAPAPAAA 473
                         90       100       110
                 ....*....|....*....|....*....|....*...
gi 149056473 779 THSPASyhPTPSPMAYQASPSPSPVgysPMTPGAPSPG 816
Cdd:PRK07764 474 PEPTAA--PAPAPPAAPAPAAAPAA---PAAPAAPAGA 506
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
697-823 3.44e-03

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 41.29  E-value: 3.44e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 697 DEPTPSPQAYGGTPNPQTPGYPDPSSPQVNPQynPQTPGTPAMYNTDQFSPYAAPSPQ-GSYQPSPSPQSYH-QVAPSPA 774
Cdd:NF033839 370 EKPKPEVKPQPETPKPEVKPQPEKPKPEVKPQ--PEKPKPEVKPQPEKPKPEVKPQPEkPKPEVKPQPEKPKpEVKPQPE 447
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|...
gi 149056473 775 GYQNTHSPASYHPTPSPMAYQASPSPSpVGYSPMTP----GAPSPGGYNPHTP 823
Cdd:NF033839 448 KPKPEVKPQPETPKPEVKPQPEKPKPE-VKPQPEKPkpdnSKPQADDKKPSTP 499
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
636-774 4.02e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.12  E-value: 4.02e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 636 GSRTPMYGSQTPLQDGSRTPHYGSQTPLHDGSRTPAQSGAWDPNNPNTPSRAEEEYEyafddeptPSPQAYGGTPNPQTP 715
Cdd:PRK07764 674 GGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASA--------PSPAADDPVPLPPEP 745
                         90       100       110       120       130
                 ....*....|....*....|....*....|....*....|....*....|....*....
gi 149056473 716 GYPDPSSPQVNPQYNPQTPGTPAmyntdqfspyAAPSPQGSYQPSPSPQSYHQVAPSPA 774
Cdd:PRK07764 746 DDPPDPAGAPAQPPPPPAPAPAA----------APAAAPPPSPPSEEEEMAEDDAPSMD 794
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
624-815 4.11e-03

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 40.62  E-value: 4.11e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 624 PMYGSQTPMYGSGSRTPMYGSQTPLQDGSRTPHYGSQTPLhDGSRTPAQSGAWDPNNP----NTPSRAEEEYEYAFDDEP 699
Cdd:cd23959   56 PLYGAVSPEGENPFDGPGLVTASTVSDCYVGNANFYEVDM-SDAFAMAPDESLGPFRAarvpNPFSASSSTQRETHKTAQ 134
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 700 TPSPQAYGGTPnPQTPGYPDPSSPQVNPqynPQTPGTPAMYNTDQ-FSPYAAPSPQGSYQPSPSPQSYHQVAPSPAGYQN 778
Cdd:cd23959  135 VAPPKAEPQTA-PVTPFGQLPMFGQHPP---PAKPLPAAAAAQQSsASPGEVASPFASGTVSASPFATATDTAPSSGAPD 210
                        170       180       190
                 ....*....|....*....|....*....|....*..
gi 149056473 779 THSPASyhPTPSPMAyqASPSPSPVGYSPMTPGAPSP 815
Cdd:cd23959  211 GFPAEA--SAPSPFA--APASAASFPAAPVANGEAAT 243
KOW pfam00467
KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, ...
455-486 5.18e-03

KOW motif; This family has been extended to coincide with ref. The KOW (Kyrpides, Ouzounis, Woese) motif is found in a variety of ribosomal proteins and NusG.


Pssm-ID: 425698 [Multi-domain]  Cd Length: 32  Bit Score: 35.44  E-value: 5.18e-03
                          10        20        30
                  ....*....|....*....|....*....|..
gi 149056473  455 VKDIVKVIDGPHSGREGEIRHLYRSFAFLHCK 486
Cdd:pfam00467   1 KGDVVRVIAGPFKGKVGKVVEVDDKKNRVLVE 32
KLF1_N cd21581
N-terminal domain of Kruppel-like Factor 1; Kruppel-like Factor 1 (KLF1, also known as ...
627-821 9.21e-03

N-terminal domain of Kruppel-like Factor 1; Kruppel-like Factor 1 (KLF1, also known as Krueppel-like factor 1 or Erythroid Kruppel-like Factor/EKLF) was the first Kruppel-like factor discovered. It was found to be vitally important for embryonic erythropoiesis in promoting the switch from fetal hemoglobin (Hemoglobin F) to adult hemoglobin (Hemoglobin A) gene expression by binding to highly conserved CACCC domains. EKLF ablation in mouse embryos produces a lethal anemic phenotype, causing death by embryonic day 14, and natural mutations lead to beta+ thalassemia in humans. However, expression of embryonic hemoglobin and fetal hemoglobin genes is normal in EKLF-deficient mice, suggesting other factors may be involved. KLF1 functions as a transcriptional activator. It belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF1, which is related to the N-terminal domains of KLF2 and KLF4.


Pssm-ID: 409227 [Multi-domain]  Cd Length: 278  Bit Score: 39.26  E-value: 9.21e-03
                         10        20        30        40        50        60        70        80
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 627 GSQTPMYGSGSRTPMYGSQTPLQdgsrtPHYGSQTPLHDGSRTpaqsgAWDPnnpntpsraeeeyEYAFDDEPTPSPQAy 706
Cdd:cd21581   26 EGQLPLDGPPDKLSPSGSEQLQV-----SQPMTEELLDDDSQA-----SWDI-------------EFLLSNWSSPSLNP- 81
                         90       100       110       120       130       140       150       160
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149056473 707 GGTPNPQTPGYPDPSSPQVNPQYNPQTPGTPAMYN----------TDQFSPYAAPSPQ-GSYQPSPSPQSYHQVAP---- 771
Cdd:cd21581   82 SLDNNTQALPQEEQPGAYYEPPKKDQPGTEGLQVGgpglmaellsPEESTGWAPPEPHhGYPDAFVGPALFPAPANvdqf 161
                        170       180       190       200       210       220
                 ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 149056473 772 --SPAGYQNTH---------SPASYHPT--PSPMAYQASPSPSPVGYSPMTPgAPSPGGYNPH 821
Cdd:cd21581  162 gfPQGGSVDRRgnlsksgswDFGSYYPQqhPSVVAFPDSRFGPLSGPQALTP-DPQHYGYFQL 223
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
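
Each hit in the list above carries a statistics line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...", and the search reports hits up to the E-value threshold of 0.01 given here. As a rough sketch for readers working from the text of this page, the snippet below parses one such line and checks it against that threshold. The function names and the dictionary keys are hypothetical, the regular expression is inferred only from the lines shown on this page, and treating the threshold as inclusive is an assumption.

```python
import re

# Pattern inferred from the statistics lines on this page; "[Multi-domain]" is optional.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line):
    """Parse one statistics line into a dict, or return None if it does not match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """True if the hit's E-value is at or below the threshold (inclusive by assumption)."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    line = "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.76  E-value: 3.06e-04"
    hit = parse_stats(line)
    print(hit, passes_threshold(hit))
```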

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res.51(D)384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res.48(D)265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures.", Nucleic Acids Res.45(D)200-3.