Conserved domains on  [gi|2462559789|ref|XP_054174268|]
activity-dependent neuroprotector homeobox protein 2 isoform X2 [Homo sapiens]

List of domain hits

Name Accession Description Interval E-value
ADNP_N super family cl45031
Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the ...
391-859 3.68e-49

Activity-dependent neuroprotector homeobox protein N-terminal; This entry represents the N-terminal domain of Activity-dependent neuroprotector homeobox protein (ADNP, also known as Activity-dependent neuroprotective protein), which contains zinc finger motifs. It is involved in transcriptional regulation and is vital for mammalian brain formation. In humans, de novo mutations result in a syndromic form of autism-like spectrum disorder (ASD), including cognitive and motor deficits, the ADNP syndrome. This protein is also related to autophagy and the pathophysiology of schizophrenia.


The actual alignment was detected with superfamily member pfam19627:

Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 187.36  E-value: 3.68e-49
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  391 PSGLLSPNQTVSSSAVVPVNQGVNSGVL------QLSQPVVSGV-LPVGQPVRPGVLQLNQTVGTNILpvNQPVR---PG 460
Cdd:pfam19627  268 PLMLIAPKPQDKKSLGVTQKGGLVTGNVrslssqQMNRLSIPKAnLLSNVHLKQGSYGLKSMPSFYVL--GQQVRlslPG 345
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  461 ASQNTTFLTSGSIlRQLIPTGkqvNGIPTYTLAPVSVTLP----VPPGGLATVAPPQM-PIQLLPSGAAAPMAGSMPGMP 535
Cdd:pfam19627  346 NAQVSVPQQSQTV-KQLLPGG---NGRPSTVGSSQSGQQParfsVQSGNSASSSSSQLkSPPLSSSVAATRALGQGPSKS 421
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  536 SppvlvnaaqsvfvqASSSAADTNQVLKqakqWKTCPVCNELFPSNVYQVHMEVAHKhsesksgeklePEKLAACAPFLK 615
Cdd:pfam19627  422 S--------------ASAAGLNTSYTQK----WKICTICNELFPENVYSAHFEKEHK-----------AEKVPAVANYIM 472
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  616 WMREKTVRCLSCKCLVSEEELIHHLLMHGLGCLFCPCTFHDIKGLSEHSRNRHLGKKKLPMDYSNRGFQLDVDaNGNLLF 695
Cdd:pfam19627  473 KIHNFTSKCLYCNRYLPSDTLLNHMLIHGLSCPYCRSTFNDVEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKN 551
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  696 PHLDFITILPKEKLGEREVYLA---------ILAGIHSKSLVPVyvKVRPQaegTPGSTGKRV--STCPFCF----GPFv 760
Cdd:pfam19627  552 IQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSDVPV--KSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI- 625
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  761 tTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNMTLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLL 838
Cdd:pfam19627  626 -SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNMTASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAP 704
                          490       500
                   ....*....|....*....|..
gi 2462559789  839 VSGEVMH-DSSFSVKRKLPDGH 859
Cdd:pfam19627  705 LKRELEHvDPALPKKRKLDDEE 726
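
The summary line and the model coordinates of a hit can be combined into a rough measure of how much of the domain model the alignment spans. The following is a minimal sketch, not part of the CD-Search output: the function names `parse_summary` and `model_coverage` are hypothetical, and the regular expression assumes the summary line has exactly the layout shown above.

```python
import re

# Summary line copied from the ADNP_N hit above.
SUMMARY = ("Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  "
           "Bit Score: 187.36  E-value: 3.68e-49")

# Hypothetical helper: assumes the field order Pssm-ID, Cd Length, Bit Score, E-value.
_SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+).*?"
    r"Cd Length:\s*(?P<cd_length>\d+)\s+"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s+"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the numeric fields of a hit summary line as a dict."""
    match = _SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("line does not look like a hit summary")
    return {
        "pssm_id": int(match.group("pssm_id")),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the alignment (inclusive ends)."""
    return (model_end - model_start + 1) / cd_length


hit = parse_summary(SUMMARY)
# Model coordinates 268 and 726 are read from the first and last Cdd rows above.
coverage = model_coverage(268, 726, hit["cd_length"])
print(f"model span covered: {coverage:.1%}")
```

The span includes gap columns, so it is an upper bound on the fraction of model positions that are actually aligned.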
PHA03247 super family cl33720
large tegument protein UL36; Provisional
145-538 2.05e-14

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 78.44  E-value: 2.05e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  145 PAAHLAAPANGSAPSAPAQPPcfhlalpqnsPSPAAGQPVTVAQGAPGSLTHSPPAA--GQSHMTLVSSPLPVGQNSLTL 222
Cdd:PHA03247  2557 PAAPPAAPDRSVPPPRPAPRP----------SEPAVTSRARRPDAPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDP 2626
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  223 QPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPV------NKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPV 296
Cdd:PHA03247  2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPP 2706
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  297 SPSVTPGVLQAVSPgvLSVSRAVPSGVLPAGQMTPAGQMTPAG-VIPGQTATSGVLPTGQMVQSGVLPVGQ-TAPSRVLP 374
Cdd:PHA03247  2707 TPEPAPHALVSATP--LPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAGPPAPAPPAAPaAGPPRRLT 2784
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  375 PGQTAPLRVISAGQVVPSGLLSPNQTVSS-SAVVPVNQGVNSGVlqlsqPVVSGVLPVGQPVRPGVLQLNQTVGTNILPV 453
Cdd:PHA03247  2785 RPAVASLSESRESLPSPWDPADPPAAVLApAAALPPAASPAGPL-----PPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  454 NQPVRPGASQNTTFLTSGsilrqliPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPG 533
Cdd:PHA03247  2860 GDVRRRPPSRSPAAKPAA-------PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932

                   ....*
gi 2462559789  534 MPSPP 538
Cdd:PHA03247  2933 PPPPP 2937
HOX smart00389
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ...
918-971 1.43e-06

Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes



Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.43e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 2462559789   918 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 971
Cdd:smart00389    1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
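
Percent identity for a pairwise block like the one above can be computed by comparing the two rows column by column. This is a minimal sketch under stated assumptions: the function name `percent_identity` is hypothetical, `-` is treated as the only gap character, and case is ignored (lowercase residues in the longer alignments on this page are counted like any other residue, which may not match how CD-Search itself scores them).

```python
def percent_identity(query_row, model_row, gap_chars="-"):
    """Count identical residues over columns where neither row has a gap.

    Returns (identical, aligned, fraction).
    """
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have the same length")
    aligned = 0
    identical = 0
    for q, m in zip(query_row, model_row):
        if q in gap_chars or m in gap_chars:
            continue  # skip gap columns
        aligned += 1
        if q.upper() == m.upper():
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned columns")
    return identical, aligned, identical / aligned


# Rows copied from the HOX (smart00389) alignment above.
QUERY = "PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY"
MODEL = "KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA"

identical, aligned, fraction = percent_identity(QUERY, MODEL)
print(f"{identical}/{aligned} identical ({fraction:.1%})")
```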
 
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
176-539 6.91e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 53.62  E-value: 6.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  176 PSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSlTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGP 255
Cdd:pfam03154  149 PSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTT-QAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAP 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  256 VNKSVGTSVLPINQ--TVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAG 333
Cdd:pfam03154  228 HTLIQQTPTLHPQRlpSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQS 307
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  334 QM--TPAGVIPGQTATSGVLPTGQ-MVQSGVLPVGQTAPSRVLP-----PGQTAPLRVISAGQ--------VVPSGLLSP 397
Cdd:pfam03154  308 QVppGPSPAAPGQSQQRIHTPPSQsQLQSQQPPREQPLPPAPLSmphikPPPTTPIPQLPNPQshkhpphlSGPSPFQMN 387
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  398 NQTVSSSAVVPVNQGVNSGVLQLSQPVVSgVLPVGQPVRPGVLQlnqtvgTNILPVNQPVRPGASQNTTFLTSGSILRQl 477
Cdd:pfam03154  388 SNLPPPPALKPLSSLSTHHPPSAHPPPLQ-LMPQSQQLPPPPAQ------PPVLTQSQSLPPPAASHPPTSGLHQVPSQ- 459
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 2462559789  478 iptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPV 539
Cdd:pfam03154  460 -------SPFPQHPFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPL 514
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
259-576 1.86e-06

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 51.18  E-value: 1.86e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  259 SVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSpgVLSVSRAVPSGVLPAGQMTPAG-QMTP 337
Cdd:cd22553     35 ETHDPLILSPPLSQPQQIITAQSSGSAAGGVAYSVSPAVQTVTVDGHEAIF--IPANSGLLQTNNQQAIQLAPGGtQAIL 112
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  338 AGvipGQTATSGVLPTGQMVQSGVLPV-GQTAPSRV---LPP---GQTAPLRV-ISA--GQVVPSGLLSPNQTVSSSAVV 407
Cdd:cd22553    113 AN---QQTLIRPNTVQGQANASNVLQNiAQIASGGNavqLPLnnmTQTIPVQVpVSTanGQTVYQTIQVPIQAIQSGNAG 189
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  408 PVNQGVNSGVL-QLSQPvvsgvlpvgqpvrpGVLQLNQTVGTNILPVNQPVRPGASQNTTFL------TSGSILRQLIPT 480
Cdd:cd22553    190 GGNQALQAQVIpQLAQA--------------AQLQPQQLAQVSSQGYIQQIPANASQQQPQMvqqgpnQSGQIIGQVASA 255
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  481 -GKQVNGIPTYTLApvSVTLPVPPGGLATVAP-PQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQASSSAADT 558
Cdd:cd22553    256 sSIQAAAIPLTVYT--GALAGQNGSNQQQVGQiVTSPIQGMTQGLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQL 333
                          330       340       350
                   ....*....|....*....|....*....|....*.
gi 2462559789  559 NQVLKQAKQWK------------------TCPVCNE 576
Cdd:cd22553    334 QQDPNDPTKWQvvadgtpgskkrlrrvacTCPNCRD 369
homeodomain cd00086
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic ...
919-975 6.99e-06

Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.


Pssm-ID: 238039 [Multi-domain]  Cd Length: 59  Bit Score: 44.16  E-value: 6.99e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 2462559789  919 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 975
Cdd:cd00086      1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
Soli_cterm TIGR03437
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in ...
340-542 2.86e-05

Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.


Pssm-ID: 274578 [Multi-domain]  Cd Length: 215  Bit Score: 46.50  E-value: 2.86e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  340 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 413
Cdd:TIGR03437    2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  414 NSGVLQLS-QPVVSGVLPVGQ-PVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 490
Cdd:TIGR03437   76 ASAAVTVTvAAAAPGIFTLDGsGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 2462559789  491 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 542
Cdd:TIGR03437  153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
135-476 1.01e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 45.79  E-value: 1.01e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  135 LLKQTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQP--VTVAQ---GAPGSLTHSPPAAGQShMTL- 208
Cdd:cd22553      1 FNQSQQVAPSELAQVATTASNIGGQQKQAQSDSSETHDPLILSPPLSQPqqIITAQssgSAAGGVAYSVSPAVQT-VTVd 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  209 ----VSSPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVnppvlpLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGP 284
Cdd:cd22553     80 gheaIFIPANSGLLQTNNQQAIQLAPGGTQAILANQQT------LIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNN 153
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  285 INRPVgPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVI-PGQTATsgvlPTGQMVQSGVLP 363
Cdd:cd22553    154 MTQTI-PVQVPVSTANGQTVYQTIQVPIQAIQSGNAGGGNQALQAQVIPQLAQAAQLqPQQLAQ----VSSQGYIQQIPA 228
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  364 VGQTAPSRVLPPGQTaplrviSAGQVVPSGL-LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQL 442
Cdd:cd22553    229 NASQQQPQMVQQGPN------QSGQIIGQVAsASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA 302
                          330       340       350
                   ....*....|....*....|....*....|....
gi 2462559789  443 NQTVGTNILPvNQPVRPGASQNTTFLTSGSILRQ 476
Cdd:cd22553    303 SSSIPTVVQQ-QAIQGNPLPPGTQIIAAGQQLQQ 335
PPE COG5651
PPE-repeat protein [Function unknown];
237-439 1.07e-04

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 45.65  E-value: 1.07e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  237 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 314
Cdd:COG5651    170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  315 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 394
Cdd:COG5651    250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*
gi 2462559789  395 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 439
Cdd:COG5651    329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
PPE COG5651
PPE-repeat protein [Function unknown];
306-541 6.80e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 39.88  E-value: 6.80e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  306 QAVSpgvLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATS---GVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLR 382
Cdd:COG5651    155 AAAS---AAAVALTPFTQPPPTITNPGGLLGAQNAGSGNTSSNpgfANLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGF 231
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  383 VISAGQVVPSGLLSPNQTVSSSAVVPVNQGvnsgvLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPVRPGAS 462
Cdd:COG5651    232 AGTGAAAGAAAAAAAAAAAAGAGASAALAS-----LAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAAT 306
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 2462559789  463 QNTTFLTSGSILRQLIPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLV 541
Cdd:COG5651    307 GLGLGAGGAAGAAGATGAGAALGAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGAAAGAASGGGAAA 385
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
281-397 9.88e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. In the case of PUF60 (GP|6176532), in complex with p54, and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 39.67  E-value: 9.88e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  281 PVGPINRPVGPGVLPVSPSVTPGvlqAVSPGVLSVSRAVPSGVLPAGQMTPAgqmTPAGVIPGQTATSGVLPTGQMVQSG 360
Cdd:TIGR01645  284 PPDALLQPATVSAIPAAAAVAAA---AATAKIMAAEAVAGAAVLGPRAQSPA---TPSSSLPTDIGNKAVVSSAKKEAEE 357
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 2462559789  361 VLPVGQTAPSRVLPPGQTAPLRVISAGQVVPSGLLSP 397
Cdd:TIGR01645  358 VPPLPQAAPAVVKPGPMEIPTPVPPPGLAIPSLVAPP 394
 
Name Accession Description Interval E-value
ADNP_N pfam19627
Activity-dependent neuroprotector homeobox protein N-terminal; This entry represent the ...
391-859 3.68e-49

Activity-dependent neuroprotector homeobox protein N-terminal; This entry represent the N-terminal domain of Activity-dependent neuroprotector homeobox protein (ADNP, also known as Activity- dependent neuroprotective protein), which contains zinc finger motifs. It is involved in transcriptional regulation and it is vital for mammalian brain formation. In humans, de novo mutations result in a syndromic form of autism-like spectrum disorder (ASD), including cognitive and motor deficits, the ADNP syndrome. This protein is also related to autophagy and the pathophysiology of schizophrenia.


Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 187.36  E-value: 3.68e-49
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  391 PSGLLSPNQTVSSSAVVPVNQGVNSGVL------QLSQPVVSGV-LPVGQPVRPGVLQLNQTVGTNILpvNQPVR---PG 460
Cdd:pfam19627  268 PLMLIAPKPQDKKSLGVTQKGGLVTGNVrslssqQMNRLSIPKAnLLSNVHLKQGSYGLKSMPSFYVL--GQQVRlslPG 345
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  461 ASQNTTFLTSGSIlRQLIPTGkqvNGIPTYTLAPVSVTLP----VPPGGLATVAPPQM-PIQLLPSGAAAPMAGSMPGMP 535
Cdd:pfam19627  346 NAQVSVPQQSQTV-KQLLPGG---NGRPSTVGSSQSGQQParfsVQSGNSASSSSSQLkSPPLSSSVAATRALGQGPSKS 421
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  536 SppvlvnaaqsvfvqASSSAADTNQVLKqakqWKTCPVCNELFPSNVYQVHMEVAHKhsesksgeklePEKLAACAPFLK 615
Cdd:pfam19627  422 S--------------ASAAGLNTSYTQK----WKICTICNELFPENVYSAHFEKEHK-----------AEKVPAVANYIM 472
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  616 WMREKTVRCLSCKCLVSEEELIHHLLMHGLGCLFCPCTFHDIKGLSEHSRNRHLGKKKLPMDYSNRGFQLDVDaNGNLLF 695
Cdd:pfam19627  473 KIHNFTSKCLYCNRYLPSDTLLNHMLIHGLSCPYCRSTFNDVEKMVAHMRMVHPDEEVGPRTDSPLTFDLTLQ-QGNPKN 551
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  696 PHLDFITILPKEKLGEREVYLA---------ILAGIHSKSLVPVyvKVRPQaegTPGSTGKRV--STCPFCF----GPFv 760
Cdd:pfam19627  552 IQLLVTTYNMRDAPEESVAFHAqnnspqpkkPKPKVQEKSDVPV--KSSPQ---AAVPYKKDVgkTLCPLCFsilkGPI- 625
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  761 tTEAYELHLKERHHIMPTVHTVLKSPAFKCIHCCGVYTGNMTLAAIAVHLVRCRSAPK--DSSSDLQAQPGFIHNSELLL 838
Cdd:pfam19627  626 -SDALAHHLRERHQVIQTVHPVEKKLTYKCIHCLGVYTSNMTASTITLHLVHCRGVGKtqNGQDKSAPSPRVTQSPGAAP 704
                          490       500
                   ....*....|....*....|..
gi 2462559789  839 VSGEVMH-DSSFSVKRKLPDGH 859
Cdd:pfam19627  705 LKRELEHvDPALPKKRKLDDEE 726
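
Each hit in this listing carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". For readers who want to tabulate the hits, a minimal Python sketch of a parser for that line follows. The function name `parse_summary` and the returned dictionary keys are illustrative assumptions, not part of any NCBI tool; only the field labels are taken from the page text.

```python
import re

# Pattern built from the field labels that appear in the summary lines above.
# The bracketed flag (e.g. "[Multi-domain]") is treated as optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<flag>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one hit summary line into a dict (hypothetical helper)."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a hit summary line: %r" % line)
    return {
        "pssm_id": int(match["pssm_id"]),
        "flag": match["flag"],  # None when no bracketed flag is present
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


hit = parse_summary(
    "Pssm-ID: 466132 [Multi-domain]  Cd Length: 744  Bit Score: 187.36  E-value: 3.68e-49"
)
```

The regular expression tolerates variable spacing between fields, since the page uses runs of two spaces that may not survive copying.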
PHA03247 PHA03247
large tegument protein UL36; Provisional
145-538 2.05e-14

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 78.44  E-value: 2.05e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  145 PAAHLAAPANGSAPSAPAQPPcfhlalpqnsPSPAAGQPVTVAQGAPGSLTHSPPAA--GQSHMTLVSSPLPVGQNSLTL 222
Cdd:PHA03247  2557 PAAPPAAPDRSVPPPRPAPRP----------SEPAVTSRARRPDAPPQSARPRAPVDdrGDPRGPAPPSPLPPDTHAPDP 2626
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  223 QPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPV------NKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPV 296
Cdd:PHA03247  2627 PPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrprrARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPP 2706
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  297 SPSVTPGVLQAVSPgvLSVSRAVPSGVLPAGQMTPAGQMTPAG-VIPGQTATSGVLPTGQMVQSGVLPVGQ-TAPSRVLP 374
Cdd:PHA03247  2707 TPEPAPHALVSATP--LPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAGPPAPAPPAAPaAGPPRRLT 2784
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  375 PGQTAPLRVISAGQVVPSGLLSPNQTVSS-SAVVPVNQGVNSGVlqlsqPVVSGVLPVGQPVRPGVLQLNQTVGTNILPV 453
Cdd:PHA03247  2785 RPAVASLSESRESLPSPWDPADPPAAVLApAAALPPAASPAGPL-----PPPTSAQPTAPPPPPGPPPPSLPLGGSVAPG 2859
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  454 NQPVRPGASQNTTFLTSGsilrqliPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPG 533
Cdd:PHA03247  2860 GDVRRRPPSRSPAAKPAA-------PARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPP 2932

                   ....*
gi 2462559789  534 MPSPP 538
Cdd:PHA03247  2933 PPPPP 2937
PHA03247 PHA03247
large tegument protein UL36; Provisional
143-537 3.28e-13

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 74.59  E-value: 3.28e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  143 PKPAAHLAAPANGSA---PSAPAQP--PCFHLALPQNSPSPAAGQPVTVAQGAPgsltHSPPAAGQSHmtlvSSPLPVGQ 217
Cdd:PHA03247  2571 PRPAPRPSEPAVTSRarrPDAPPQSarPRAPVDDRGDPRGPAPPSPLPPDTHAP----DPPPPSPSPA----ANEPDPHP 2642
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  218 NSLTLQPPAPQPVFLSHGVPLHQSVNP---PVLPLSQPVGPVNKSVGTSVLPINQTVRPGvlPLTQPVGPINRPVGPGVl 294
Cdd:PHA03247  2643 PPTVPPPERPRDDPAPGRVSRPRRARRlgrAAQASSPPQRPRRRAARPTVGSLTSLADPP--PPPPTPEPAPHALVSAT- 2719
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  295 PVSPSVTPGVLQAVSPGVLSVSRAVPSG-VLPAGQMTPAGQMTPAGviPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVL 373
Cdd:PHA03247  2720 PLPPGPAAARQASPALPAAPAPPAVPAGpATPGGPARPARPPTTAG--PPAPAPPAAPAAGPPRRLTRPAVASLSESRES 2797
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  374 PPGQTAPLRViSAGQVVPSGLLSPNQTVSS-----SAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQlnQTVGT 448
Cdd:PHA03247  2798 LPSPWDPADP-PAAVLAPAAALPPAASPAGplpppTSAQPTAPPPPPGPPPPSLPLGGSVAPGGDVRRRPPSR--SPAAK 2874
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  449 NILPVNQPVR----PGASQNTTFLTSGSILRQLIPTGK-QVNGIPTYTLAPVSVTLPVPPgglatvAPPQMPIQLLPSGA 523
Cdd:PHA03247  2875 PAAPARPPVRrlarPAVSRSTESFALPPDQPERPPQPQaPPPPQPQPQPPPPPQPQPPPP------PPPRPQPPLAPTTD 2948
                          410
                   ....*....|....
gi 2462559789  524 AAPMAGSMPGMPSP 537
Cdd:PHA03247  2949 PAGAGEPSGAVPQP 2962
PHA03247 PHA03247
large tegument protein UL36; Provisional
143-389 6.04e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 60.34  E-value: 6.04e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  143 PKPAAhLAAPANGSAPSAPAQPPCFHLALPQNSP-SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLT 221
Cdd:PHA03247  2758 ARPPT-TAGPPAPAPPAAPAAGPPRRLTRPAVASlSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQP 2836
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  222 LQPPAPqPVFLSHGVPLHQSVNP--PVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGpiNRPVGPGVLPVSPS 299
Cdd:PHA03247  2837 TAPPPP-PGPPPPSLPLGGSVAPggDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFA--LPPDQPERPPQPQA 2913
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  300 VTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTP----AGQMTPAGVIPgQTATSGVLPTGQMVQSGVLPvgQTAPSRVLPP 375
Cdd:PHA03247  2914 PPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPttdpAGAGEPSGAVP-QPWLGALVPGRVAVPRFRVP--QPAPSREAPA 2990
                          250
                   ....*....|....
gi 2462559789  376 GQTAPLRVISAGQV 389
Cdd:PHA03247  2991 SSTPPLTGHSLSRV 3004
PHA03247 PHA03247
large tegument protein UL36; Provisional
142-382 2.52e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.41  E-value: 2.52e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  142 APKPAAHLAAPANGSAPSAPAQPPcfhlALPQNSPSPAAGQPVTVAQGAPGS-LTHSPPAAGQSHMTLVSsPLPVGQNSL 220
Cdd:PHA03247  2688 ARPTVGSLTSLADPPPPPPTPEPA----PHALVSATPLPPGPAAARQASPALpAAPAPPAVPAGPATPGG-PARPARPPT 2762
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  221 TLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQ-PVGPINRPvgPGVLPVSPS 299
Cdd:PHA03247  2763 TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAsPAGPLPPP--TSAQPTAPP 2840
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  300 VTPGVLQAvspgvlsvSRAVPSGVLPAGqmtPAGQMTPAGVIPGQTATSGVLPTGQMVQSgvlPVGQTAPSRVLPPGQTA 379
Cdd:PHA03247  2841 PPPGPPPP--------SLPLGGSVAPGG---DVRRRPPSRSPAAKPAAPARPPVRRLARP---AVSRSTESFALPPDQPE 2906

                   ...
gi 2462559789  380 PLR 382
Cdd:PHA03247  2907 RPP 2909
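
The four PHA03247 hits above report overlapping query intervals (145-538, 143-537, 143-389 and 142-382). A short Python sketch of how such intervals could be merged to see which stretch of the query they jointly cover is given here. The function name `merge_intervals` is a hypothetical helper, and treating the intervals as inclusive residue ranges is an assumption based on how the alignments are numbered.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent inclusive (start, end) residue ranges."""
    merged = []
    for start, end in sorted(intervals):
        # Adjacent ranges (e.g. 10-20 and 21-30) are joined as well,
        # because residue positions are integers.
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Intervals copied from the four PHA03247 hit rows above.
pha03247_hits = [(145, 538), (143, 537), (143, 389), (142, 382)]
covered = merge_intervals(pha03247_hits)  # [(142, 538)]
```

All four hits collapse into a single query region, consistent with the same low-complexity, proline-rich stretch being matched repeatedly against different parts of the PHA03247 model.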
PHA03379 PHA03379
EBNA-3A; Provisional
130-461 3.67e-07

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 54.29  E-value: 3.67e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  130 IKRTGllKQTHIAPKPAAHLAAPANGSaPSAPAQPPcfHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMtlv 209
Cdd:PHA03379   391 LMRAG--KLTERAREALEKASEPTYGT-PRPPVEKP--RPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSM--- 462
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  210 sSPLPVGQNsltlqPPAP----QPVFLSHGVPlhQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPI 285
Cdd:PHA03379   463 -APCPVAQL-----PPGPlqdlEPGDQLPGVV--QDGRPACAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPV 534
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  286 NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAV----------PSGVLPAGQMT---------PAGQMTPAGV------ 340
Cdd:PHA03379   535 PVPTVALERPVCPAPPLIAMQG--PGETSGIVRVrerwrpapwtPNPPRSPSQMSvrdrlarlrAEAQPYQASVevqppq 612
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  341 ---IPGQTATSGVL-PTGQM------------VQSGVLPVGQtAPSRVLPPGQ-------TAPLRViSAGQVVPSGLLSP 397
Cdd:PHA03379   613 ltqVSPQQPMEYPLePEQQMfpgspfsqvadvMRAGGVPAMQ-PQYFDLPLQQpisqgapLAPLRA-SMGPVPPVPATQP 690
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 2462559789  398 nQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLP---------VGQPVRPGVLQlNQTVGtniLPVNQPVRPGA 461
Cdd:PHA03379   691 -QYFDIPLTEPINQGASAAHFLPQQPMEGPLVPerwmfqgatLSQSVRPGVAQ-SQYFD---LPLTQPINHGA 758
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
176-539 6.91e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 53.62  E-value: 6.91e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  176 PSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSlTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPVGP 255
Cdd:pfam03154  149 PSPQDNESDSDSSAQQQILQTQPPVLQAQSGAASPPSPPPPGTT-QAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAP 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  256 VNKSVGTSVLPINQ--TVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAG 333
Cdd:pfam03154  228 HTLIQQTPTLHPQRlpSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQS 307
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  334 QM--TPAGVIPGQTATSGVLPTGQ-MVQSGVLPVGQTAPSRVLP-----PGQTAPLRVISAGQ--------VVPSGLLSP 397
Cdd:pfam03154  308 QVppGPSPAAPGQSQQRIHTPPSQsQLQSQQPPREQPLPPAPLSmphikPPPTTPIPQLPNPQshkhpphlSGPSPFQMN 387
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  398 NQTVSSSAVVPVNQGVNSGVLQLSQPVVSgVLPVGQPVRPGVLQlnqtvgTNILPVNQPVRPGASQNTTFLTSGSILRQl 477
Cdd:pfam03154  388 SNLPPPPALKPLSSLSTHHPPSAHPPPLQ-LMPQSQQLPPPPAQ------PPVLTQSQSLPPPAASHPPTSGLHQVPSQ- 459
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 2462559789  478 iptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPV 539
Cdd:pfam03154  460 -------SPFPQHPFVPGGPPPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPL 514
HOX smart00389
Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key ...
918-971 1.43e-06

Homeodomain; DNA-binding factors that are involved in the transcriptional regulation of key developmental processes


Pssm-ID: 197696 [Multi-domain]  Cd Length: 57  Bit Score: 46.09  E-value: 1.43e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 2462559789   918 PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY 971
Cdd:smart00389    1 KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA 54
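
Each alignment block pairs a query row with a Cdd row of equal width. As a rough way to quantify such a pair, the Python sketch below counts identical residues over the columns where neither row has a gap. The function name `identity` is a hypothetical helper. Lowercase residues, which appear in some rows of this output, are compared case-insensitively here; that is a simplifying assumption.

```python
def identity(query_row, subject_row, gap="-"):
    """Return (identical, aligned) column counts for two equal-length rows."""
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if q == gap or s == gap:
            continue  # skip gap columns in either row
        aligned += 1
        if q.upper() == s.upper():
            identical += 1
    return identical, aligned


# Rows copied from the smart00389 (HOX) alignment above.
query = "PKKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRY"
model = "KRRKRTSFTPEQLEELEKEFQKNPYPSREEREELAKKLGLSERQVKVWFQNRRA"
n_identical, n_aligned = identity(query, model)
fraction = n_identical / n_aligned if n_aligned else 0.0
```

Identity computed this way is only a descriptive figure. The bit score and E-value reported by the search come from the position-specific scoring model, not from a residue-by-residue comparison.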
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
259-576 1.86e-06

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 51.18  E-value: 1.86e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  259 SVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSpgVLSVSRAVPSGVLPAGQMTPAG-QMTP 337
Cdd:cd22553     35 ETHDPLILSPPLSQPQQIITAQSSGSAAGGVAYSVSPAVQTVTVDGHEAIF--IPANSGLLQTNNQQAIQLAPGGtQAIL 112
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  338 AGvipGQTATSGVLPTGQMVQSGVLPV-GQTAPSRV---LPP---GQTAPLRV-ISA--GQVVPSGLLSPNQTVSSSAVV 407
Cdd:cd22553    113 AN---QQTLIRPNTVQGQANASNVLQNiAQIASGGNavqLPLnnmTQTIPVQVpVSTanGQTVYQTIQVPIQAIQSGNAG 189
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  408 PVNQGVNSGVL-QLSQPvvsgvlpvgqpvrpGVLQLNQTVGTNILPVNQPVRPGASQNTTFL------TSGSILRQLIPT 480
Cdd:cd22553    190 GGNQALQAQVIpQLAQA--------------AQLQPQQLAQVSSQGYIQQIPANASQQQPQMvqqgpnQSGQIIGQVASA 255
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  481 -GKQVNGIPTYTLApvSVTLPVPPGGLATVAP-PQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQASSSAADT 558
Cdd:cd22553    256 sSIQAAAIPLTVYT--GALAGQNGSNQQQVGQiVTSPIQGMTQGLTAPASSSIPTVVQQQAIQGNPLPPGTQIIAAGQQL 333
                          330       340       350
                   ....*....|....*....|....*....|....*.
gi 2462559789  559 NQVLKQAKQWK------------------TCPVCNE 576
Cdd:cd22553    334 QQDPNDPTKWQvvadgtpgskkrlrrvacTCPNCRD 369
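
The cd22553 description above writes the SP/KLF consensus as NNRCRCCYY using degenerate nucleotide codes (N is any nucleotide, R is A/G, Y is C/T). A minimal Python sketch of how such a code string could be expanded into a regular expression and scanned against one strand of a DNA sequence follows. The helper names and the test sequence are made up for illustration. The scan is a strict, single-strand reading; the description calls the consensus loose, so real binding sites may deviate from it.

```python
import re

# Only the codes used in the description, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def motif_to_regex(motif):
    """Expand a degenerate motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def find_sites(motif, sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = motif_to_regex(motif).pattern
    # The lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence.upper())]


# Made-up sequence, used only to exercise the helper.
sites = find_sites("NNRCRCCYY", "TTGGACACCCTAA")  # [2]
```

Scanning the reverse complement as well, or allowing mismatches, would be natural extensions but are outside this sketch.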
homeodomain cd00086
Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic ...
919-975 6.99e-06

Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner.


Pssm-ID: 238039 [Multi-domain]  Cd Length: 59  Bit Score: 44.16  E-value: 6.99e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 2462559789  919 KKYEGRSYEEKKQFLKDYFHKKPYPSKKEIELLSSLFWVWKIDVASFFGKRRYICMK 975
Cdd:cd00086      1 RRKRTRFTPEQLEELEKEFEKNPYPSREEREELAKELGLTERQVKIWFQNRRAKLKR 57
PHA03247 PHA03247
large tegument protein UL36; Provisional
152-556 1.10e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 49.94  E-value: 1.10e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  152 PANGSAPSAPAQPPCfhlalPQNSPSPAAGQPVTVAQGAPGSLTHSPPA-AGQSHMTLVSSPLPVGQNSLTLQPPAPQPV 230
Cdd:PHA03247  2483 PAEARFPFAAGAAPD-----PGGGGPPDPDAPPAPSRLAPAILPDEPVGePVHPRMLTWIRGLEELASDDAGDPPPPLPP 2557
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  231 FLSHGVPlHQSVnPPVLPLSQPVGPVNKSvgtsvlpinQTVRPGVLPltQPvgpiNRPVGPGVLPVSPsvtpgvlqavsP 310
Cdd:PHA03247  2558 AAPPAAP-DRSV-PPPRPAPRPSEPAVTS---------RARRPDAPP--QS----ARPRAPVDDRGDP-----------R 2609
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  311 GVLSVSRAVPSGVLPAgqmTPAGQMTPAGVIPGQTATSGVLPTGQmvqsgvlPVGQTAPSRVLPPgqtapLRVISAGQvv 390
Cdd:PHA03247  2610 GPAPPSPLPPDTHAPD---PPPPSPSPAANEPDPHPPPTVPPPER-------PRDDPAPGRVSRP-----RRARRLGR-- 2672
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  391 PSGLLSPNQTVSSSAVVPVNQGVNSgvlqLSQPVVSGVLPVGQPvRPGVLQLNQTVGTNILPVNQPVRPGASqnttfLTS 470
Cdd:PHA03247  2673 AAQASSPPQRPRRRAARPTVGSLTS----LADPPPPPPTPEPAP-HALVSATPLPPGPAAARQASPALPAAP-----APP 2742
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  471 GSILRQLIPTGKQVNGIPTYTLAPVSvtlPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLVNAAQSVFVQ 550
Cdd:PHA03247  2743 AVPAGPATPGGPARPARPPTTAGPPA---PAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALP 2819

                   ....*.
gi 2462559789  551 ASSSAA 556
Cdd:PHA03247  2820 PAASPA 2825
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
138-555 1.15e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 49.38  E-value: 1.15e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  138 QTHIAPKPAAHLAAPANGSAPSaPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPaagQSHMTLVSSPLPVGQ 217
Cdd:pfam03154  175 QAQSGAASPPSPPPPGTTQAAT-AGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTP---TLHPQRLPSPHPPLQ 250
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  218 nSLTLQPPAPQPVFLSHGVPLHQSVNPPvLPLSQPVGP--VNKSVGTSVLPI-NQTVRPGVLPLTQPVGPI---NRPVGP 291
Cdd:pfam03154  251 -PMTQPPPPSQVSPQPLPQPSLHGQMPP-MPHSLQTGPshMQHPVPPQPFPLtPQSSQSQVPPGPSPAAPGqsqQRIHTP 328
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  292 GVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQM-TPAGQMTPAGVipgqtatSGVLPTgQMvqsgvlpvgqtaPS 370
Cdd:pfam03154  329 PSQSQLQSQQPPREQPLPPAPLSMPHIKPPPTTPIPQLpNPQSHKHPPHL-------SGPSPF-QM------------NS 388
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  371 RVLPPGQTAPLRVISAGQVvPSGLLSPnqtvsssavvpvnqgvnsgvLQLsqpvvsgvLPVGQPVRPgvlqlnqtvgtni 450
Cdd:pfam03154  389 NLPPPPALKPLSSLSTHHP-PSAHPPP--------------------LQL--------MPQSQQLPP------------- 426
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  451 lPVNQPvrPGASQNTTFLTSGSilrqliptgkqvNGIPTYTLAPVSVTLPVPPGGLATVAPPQMpiqLLPSGAAAPMAGS 530
Cdd:pfam03154  427 -PPAQP--PVLTQSQSLPPPAA------------SHPPTSGLHQVPSQSPFPQHPFVPGGPPPI---TPPSGPPTSTSSA 488
                          410       420
                   ....*....|....*....|....*
gi 2462559789  531 MPGMpSPPVLVNAAQSVFVQASSSA 555
Cdd:pfam03154  489 MPGI-QPPSSASVSSSGPVPAAVSC 512
PHA03247 PHA03247
large tegument protein UL36; Provisional
141-309 2.63e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.40  E-value: 2.63e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  141 IAPKPAAHLAAPANGSAPSAPAQPPCFHLA--------LPQNSP--SPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVS 210
Cdd:PHA03247  2828 LPPPTSAQPTAPPPPPGPPPPSLPLGGSVApggdvrrrPPSRSPaaKPAAPARPPVRRLARPAVSRSTESFALPPDQPER 2907
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  211 SPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVNPPVLPLSQPvGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVG 290
Cdd:PHA03247  2908 PPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDP-AGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPSR 2986
                          170
                   ....*....|....*....
gi 2462559789  291 PGVLPVSPSVTPGVLQAVS 309
Cdd:PHA03247  2987 EAPASSTPPLTGHSLSRVS 3005
Soli_cterm TIGR03437
Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in ...
340-542 2.86e-05

Solibacter uncharacterized C-terminal domain; This model describes a protein domain found in 90 proteins of Solibacter usitatus Ellin6076, nearly always as the C-terminal domain of a much larger protein. No homologs to this domain are detected outside of S. usitatus, a member of the Acidobacteria.


Pssm-ID: 274578 [Multi-domain]  Cd Length: 215  Bit Score: 46.50  E-value: 2.86e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  340 VIPGQTAT---SGVLPTGQMVQSGVLPVgQTAPSRVLPPGQTAPLRVISAGQV---VPSGLLSPNQTVsssaVVpVNQGV 413
Cdd:TIGR03437    2 VAPGSIVSifgTNLAPATLTAAGGPLPT-SLGGVSVTVNGVAAPLLYVSPGQInaqVPYEVAPGAATV----TV-TYNGG 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  414 NSGVLQLS-QPVVSGVLPVGQ-PVRPGVLQLNQtvGTNILPVNQPVRPGaSQNTTFLTSGSILRQLIPTGKQVNGIPTY- 490
Cdd:TIGR03437   76 ASAAVTVTvAAAAPGIFTLDGsGTGQAAALNNQ--DGSVNSAANPAAPG-DVVVLYATGLGPTSPAVADGAPAPSSPLAp 152
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 2462559789  491 TLAPVSVTL-----PVPPGGLATVAPPQMPIQL-LPSGAAA---PMAGSMPGMPSPPVLVN 542
Cdd:TIGR03437  153 ALAPVTVTIggvpaTVLYAGLAPGFVGLYQVNVrVPAGLATgavPVVITVGGVTSNAVTIA 213
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
320-556 9.81e-05

Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 45.52  E-value: 9.81e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  320 PSGVLPAGqmtpaGQMTPAGVIPGqtaTSGVLPTGqmvqsGVlPVGQT----APSRVLPPGQTaplrVISAGQVVPSGLL 395
Cdd:pfam16072   13 PGGYAPAG-----ATYHPAGQVPA---GATYYPSG-----GV-PHGATyypqAPVAAVPAGAT----YLPAGAAIPAGAT 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  396 SPNQTVSSSAVVPVNQGVNSG---------VLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPvrPGASQNTT 466
Cdd:pfam16072   75 YYPQAPKSSSGLGLGTGLIAGalggailghALTPTQTRVVEHAPSSGGGGGGGGYSNGNNEDKIIIINNG--PPGSVTTT 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  467 FLTSGSilrQLIPTGKQVNGiptytlAPVSVTLPVPPGGLATVAPPQMPIQlLPSGAAAPMAGSMPGMPSPPVLVNAAQS 546
Cdd:pfam16072  153 SAGSGT---TVINAGGQQPA------APAAPAYPVAPAAYPAQAPAAAPAP-APGAPQTPLAPLNPVAAAPAAAAGAAAA 222
                          250
                   ....*....|
gi 2462559789  547 VFVQASSSAA 556
Cdd:pfam16072  223 PVVAAAAPAA 232
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
135-476 1.01e-04

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 45.79  E-value: 1.01e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  135 LLKQTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQP--VTVAQ---GAPGSLTHSPPAAGQShMTL- 208
Cdd:cd22553      1 FNQSQQVAPSELAQVATTASNIGGQQKQAQSDSSETHDPLILSPPLSQPqqIITAQssgSAAGGVAYSVSPAVQT-VTVd 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  209 ----VSSPLPVGQNSLTLQPPAPQPVFLSHGVPLHQSVnppvlpLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGP 284
Cdd:cd22553     80 gheaIFIPANSGLLQTNNQQAIQLAPGGTQAILANQQT------LIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNN 153
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  285 INRPVgPGVLPVSPSVTPGVLQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVI-PGQTATsgvlPTGQMVQSGVLP 363
Cdd:cd22553    154 MTQTI-PVQVPVSTANGQTVYQTIQVPIQAIQSGNAGGGNQALQAQVIPQLAQAAQLqPQQLAQ----VSSQGYIQQIPA 228
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  364 VGQTAPSRVLPPGQTaplrviSAGQVVPSGL-LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGVLQL 442
Cdd:cd22553    229 NASQQQPQMVQQGPN------QSGQIIGQVAsASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA 302
                          330       340       350
                   ....*....|....*....|....*....|....
gi 2462559789  443 NQTVGTNILPvNQPVRPGASQNTTFLTSGSILRQ 476
Cdd:cd22553    303 SSSIPTVVQQ-QAIQGNPLPPGTQIIAAGQQLQQ 335
PPE COG5651
PPE-repeat protein [Function unknown];
237-439 1.07e-04

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 45.65  E-value: 1.07e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  237 PLHQSVNPPVLPLSQPVGPVNKSVGTSV--LPINQTVRPGVLPLTQPVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLS 314
Cdd:COG5651    170 PPPTITNPGGLLGAQNAGSGNTSSNPGFanLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAAAAA 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  315 VSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATSGVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLRVISAGqVVPSGL 394
Cdd:COG5651    250 AAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAATGLGLGAGGAAGAAGAT-GAGAAL 328
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*
gi 2462559789  395 LSPNQTVSSSAVVPVNQGVNSGVLQLSQPVVSGVLPVGQPVRPGV 439
Cdd:COG5651    329 GAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGA 373
PHA02682 PHA02682
ORF080 virion core protein; Provisional
143-250 1.23e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 45.24  E-value: 1.23e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  143 PKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAGQPvtvAQGAPGSLTHSPPAagqshmtlvsSPLPvgqnslTL 222
Cdd:PHA02682    96 PACAPAAPAPAVTCPAPAPACPPATAPTCPPPAVCPAPARP---APACPPSTRQCPPA----------PPLP------TP 156
                           90       100
                   ....*....|....*....|....*....
gi 2462559789  223 QP-PAPQPVFlshgvpLHQSVNPPVLPLS 250
Cdd:PHA02682   157 KPaPAAKPIF------LHNQLPPPDYPAA 179
PRK10263 PRK10263
DNA translocase FtsK; Provisional
143-375 2.13e-04

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 45.46  E-value: 2.13e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  143 PKPAAHLAAPAngSAPSAPAQPPCFHLALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQ---SHMTLVSSPLPVGQNS 219
Cdd:PRK10263   362 PVPGPQTGEPV--IAPAPEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYyapAPEQPAQQPYYAPAPE 439
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  220 LTL-----QPPAPQPVFLSHgvPLHQSVNPPVLPLSQPVG-----PVNKSVGTSVLPINQTVRPGVLPL----------- 278
Cdd:PRK10263   440 QPVagnawQAEEQQSTFAPQ--STYQTEQTYQQPAAQEPLyqqpqPVEQQPVVEPEPVVEETKPARPPLyyfeeveekra 517
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  279 ---------TQPV-GPI--NRPVGPGVLPVSPSVTPGVLQAvsPGVLSVSRAVPSGVLPAGqmTPAGQMTPAGvipgqTA 346
Cdd:PRK10263   518 rereqlaawYQPIpEPVkePEPIKSSLKAPSVAAVPPVEAA--AAVSPLASGVKKATLATG--AAATVAAPVF-----SL 588
                          250       260
                   ....*....|....*....|....*....
gi 2462559789  347 TSGVLPTGQmVQSGVLPvGQTAPSRVLPP 375
Cdd:PRK10263   589 ANSGGPRPQ-VKEGIGP-QLPRPKRIRVP 615
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
142-288 3.65e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 44.64  E-value: 3.65e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  142 APKPAAhlaaPANGSAPSAPAQPPCFHLALPQNSPSPAAGQ--PVTVAQGAPGSLTHSPPAA----GQSHMTLVSSPLPV 215
Cdd:pfam09770  207 AKKPAQ----QPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQqqPQQQPQQPQQHPGQGHPVTilqrPQSPQPDPAQPSIQ 282
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 2462559789  216 GQNSLTLQPPAPQPVflshgVPLHQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVR-PGVLPltQPVGPINRP 288
Cdd:pfam09770  283 PQAQQFHQQPPPVPV-----QPTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRqQGSFG--RQAPIITHP 349
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
139-309 6.35e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 43.99  E-value: 6.35e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  139 THIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPQNSPSPAAgQPVTVAQgapgSLTHSPPAAgqshmtlvSSPLPVGQN 218
Cdd:pfam03154  388 SNLPPPPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPA-QPPVLTQ----SQSLPPPAA--------SHPPTSGLH 454
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  219 SLTLQPPAPQPVFLSHGVPL------HQSVNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPINRPVGPG 292
Cdd:pfam03154  455 QVPSQSPFPQHPFVPGGPPPitppsgPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAEEPESPP 534
                          170
                   ....*....|....*..
gi 2462559789  293 VLPVSPSVTPGVLQAVS 309
Cdd:pfam03154  535 PPPRSPSPEPTVVNTPS 551
PHA03247 PHA03247
large tegument protein UL36; Provisional
145-540 1.25e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 43.00  E-value: 1.25e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  145 PAAHLAAP-ANGSAPSAPAQPPCFHLALPQNSPSPAAGQPVTV----------------AQGAPGSLTHS--PPAAGQSH 205
Cdd:PHA03247  2489 PFAAGAAPdPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPrmltwirgleelasddAGDPPPPLPPAapPAAPDRSV 2568
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  206 MTLVSSPLPVGqnsltlqpPAPQPVFLSHGVPLHQsvNPPVLPLSQPVGPVNKSVGTSVLPINQTVRPgvlPLTQPVGPI 285
Cdd:PHA03247  2569 PPPRPAPRPSE--------PAVTSRARRPDAPPQS--ARPRAPVDDRGDPRGPAPPSPLPPDTHAPDP---PPPSPSPAA 2635
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  286 NRPVGPGVLPVSPSVTPGvlQAVSPGVLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGqtatsgVLPTGQMVQSGVLPVG 365
Cdd:PHA03247  2636 NEPDPHPPPTVPPPERPR--DDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPT------VGSLTSLADPPPPPPT 2707
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  366 QTAPSRVLPPGQTAPLRVISAGQVVPSGLLSPnqtvsssavvpvnqgvnsgvlqLSQPVVSGVLPVGQPVRPGVLQLNQT 445
Cdd:PHA03247  2708 PEPAPHALVSATPLPPGPAAARQASPALPAAP----------------------APPAVPAGPATPGGPARPARPPTTAG 2765
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  446 VGTNILPVNQPVRPGASQNTTFLTSGSILRQLIPTGKQVngiptytlAPVSVTLPVPPGGLATVAPPQMPiqLLPSGAAA 525
Cdd:PHA03247  2766 PPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDP--------ADPPAAVLAPAAALPPAASPAGP--LPPPTSAQ 2835
                          410
                   ....*....|....*
gi 2462559789  526 PMAGSMPGMPSPPVL 540
Cdd:PHA03247  2836 PTAPPPPPGPPPPSL 2850
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
138-316 1.38e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 42.56  E-value: 1.38e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  138 QTHIAPKPAAHLAAPANGSAPSAPAQPPCFHLALPqnSPSPAAGQPVTVAQGAPGSLTHSPPAAgqshmtlvSSPLPVGQ 217
Cdd:PRK12323   392 PAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARR--SPAPEALAAARQASARGPGGAPAPAPA--------PAAAPAAA 461
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  218 NSLTLQPPAPQPVFLSHGVPLHQSVNPPV-----------LPLSQPV-GPVNKSVGTSVLPINQTVRPGVLPL-----TQ 280
Cdd:PRK12323   462 ARPAAAGPRPVAAAAAAAPARAAPAAAPApadddpppweeLPPEFASpAPAQPDAAPAGWVAESIPDPATADPddafeTL 541
                          170       180       190
                   ....*....|....*....|....*....|....*.
gi 2462559789  281 PVGPINRPVGPGVLPVSPSVTPGVLQAVSPGVLSVS 316
Cdd:PRK12323   542 APAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMF 577
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
144-308 2.41e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 41.77  E-value: 2.41e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  144 KPAAHLAAPANGSAPSAPAqppcfhlALPQNSPSPAAGQPVTVAQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQ 223
Cdd:PRK07994   360 HPAAPLPEPEVPPQSAAPA-------ASAQATAAPTAAVAPPQAPAVPPPPASAPQQAPAVPLPETTSQLLAARQQLQRA 432
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  224 PPAPQPvflshgvplhqsvnppvlPLSQPVGPVNKSVGTSVLPINQTVRPGVLPLTQPVGPIN----RPVGPGVLPVSPS 299
Cdd:PRK07994   433 QGATKA------------------KKSEPAAASRARPVNSALERLASVRPAPSALEKAPAKKEayrwKATNPVEVKKEPV 494

                   ....*....
gi 2462559789  300 VTPGVLQAV 308
Cdd:PRK07994   495 ATPKALKKA 503
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
122-409 6.17e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.52  E-value: 6.17e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  122 LRSVISEHIKRtglLKQTHIAPKPAAHLAAPANgsAPSAPAQPPCFHLALP------QNSPS----PAAGQPVTV----- 186
Cdd:pfam03154  231 IQQTPTLHPQR---LPSPHPPLQPMTQPPPPSQ--VSPQPLPQPSLHGQMPpmphslQTGPShmqhPVPPQPFPLtpqss 305
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  187 -AQGAPGSLTHSPPAAGQSHMTLVSSPLPVGQNSLTLQPPAPQPVFLSHGVPlhqsvnPPVLPLSQpvgpvnksvgtsvL 265
Cdd:pfam03154  306 qSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSMPHIKP------PPTTPIPQ-------------L 366
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  266 PINQTVR-PGVLPLTQPVG-PINRPVGPGVLPVS-------PSVTPGVLQAVSPGV-LSVSRAVPSGVLPAGQMTPAGQM 335
Cdd:pfam03154  367 PNPQSHKhPPHLSGPSPFQmNSNLPPPPALKPLSslsthhpPSAHPPPLQLMPQSQqLPPPPAQPPVLTQSQSLPPPAAS 446
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 2462559789  336 TPAGVIPGQTATSGVLPTGQMVQSGVLPVgqTAPSRVlPPGQTAPLRVISAGQVVPSGLLSPNQTVSSSAVVPV 409
Cdd:pfam03154  447 HPPTSGLHQVPSQSPFPQHPFVPGGPPPI--TPPSGP-PTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPV 517
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
188-586 6.18e-03

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 40.38  E-value: 6.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  188 QGAPGSLTHSPPAAGQSHMtlVSSPLPVGQN--SLTLQPPAPQPVFLSHGVPLHQSVNPPVL--PLSQPVGPVNKSVGTS 263
Cdd:pfam09606   60 QQQPQGGQGNGGMGGGQQG--MPDPINALQNlaGQGTRPQMMGPMGPGPGGPMGQQMGGPGTasNLLASLGRPQMPMGGA 137
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  264 VLPINQTvrpGVLPLTQPVGpinrpVGPGVLPVSPSVTPGVLQAvspgvlsvsravpsgvlPAGQMTPAGQMTPaGVIPG 343
Cdd:pfam09606  138 GFPSQMS---RVGRMQPGGQ-----AGGMMQPSSGQPGSGTPNQ-----------------MGPNGGPGQGQAG-GMNGG 191
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  344 QTATSGVLPTGQMVQSGVL-------PVGQTAPSRVLPP---GQTAPLRVISAGQVVPSGllsPNQTVSSSAVVPVNQgV 413
Cdd:pfam09606  192 QQGPMGGQMPPQMGVPGMPgpadagaQMGQQAQANGGMNpqqMGGAPNQVAMQQQQPQQQ---GQQSQLGMGINQMQQ-M 267
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  414 NSGVLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPVRPgasqnttfltsgsilRQlipTGKQVNGIPTytlA 493
Cdd:pfam09606  268 PQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQ---------------QQ---QQQGGNHPAA---H 326
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  494 PVSVTLPVPPGGLATVAPPQMPIQLLPSGA-----AAPMAGSMPGMPSPPVLVNAAQSVFVQasssaadTNQVLKQAKQw 568
Cdd:pfam09606  327 QQQMNQSVGQGGQVVALGGLNHLETWNPGNfgglgANPMQRGQPGMMSSPSPVPGQQVRQVT-------PNQFMRQSPQ- 398
                          410
                   ....*....|....*...
gi 2462559789  569 ktcpvcnelfPSNVYQVH 586
Cdd:pfam09606  399 ----------PSVPSPQG 406
PPE COG5651
PPE-repeat protein [Function unknown];
306-541 6.80e-03

PPE-repeat protein [Function unknown];


Pssm-ID: 444372 [Multi-domain]  Cd Length: 385  Bit Score: 39.88  E-value: 6.80e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  306 QAVSpgvLSVSRAVPSGVLPAGQMTPAGQMTPAGVIPGQTATS---GVLPTGQMVQSGVLPVGQTAPSRVLPPGQTAPLR 382
Cdd:COG5651    155 AAAS---AAAVALTPFTQPPPTITNPGGLLGAQNAGSGNTSSNpgfANLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGF 231
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  383 VISAGQVVPSGLLSPNQTVSSSAVVPVNQGvnsgvLQLSQPVVSGVLPVGQPVRPGVLQLNQTVGTNILPVNQPVRPGAS 462
Cdd:COG5651    232 AGTGAAAGAAAAAAAAAAAAGAGASAALAS-----LAATLLNASSLGLAATAASSAATNLGLAGSPLGLAGGGAGAAAAT 306
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 2462559789  463 QNTTFLTSGSILRQLIPTGKQVNGIPTYTLAPVSVTLPVPPGGLATVAPPQMPIQLLPSGAAAPMAGSMPGMPSPPVLV 541
Cdd:COG5651    307 GLGLGAGGAAGAAGATGAGAALGAGAAAAAAGAAAGAGAAAAAAAGGAGGGGGGALGAGGGGGSAGAAAGAASGGGAAA 385
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
281-397 9.88e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. In the case of PUF60 (GP|6176532), in complex with p54, and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 39.67  E-value: 9.88e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2462559789  281 PVGPINRPVGPGVLPVSPSVTPGvlqAVSPGVLSVSRAVPSGVLPAGQMTPAgqmTPAGVIPGQTATSGVLPTGQMVQSG 360
Cdd:TIGR01645  284 PPDALLQPATVSAIPAAAAVAAA---AATAKIMAAEAVAGAAVLGPRAQSPA---TPSSSLPTDIGNKAVVSSAKKEAEE 357
                           90       100       110
                   ....*....|....*....|....*....|....*..
gi 2462559789  361 VLPVGQTAPSRVLPPGQTAPLRVISAGQVVPSGLLSP 397
Cdd:TIGR01645  358 VPPLPQAAPAVVKPGPMEIPTPVPPPGLAIPSLVAPP 394
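
Each alignment block above pairs a query row (`gi 2462559789`) with a subject row (`Cdd:...`), with `-` marking gaps. As a rough way to compare such rows, the Python sketch below computes the fraction of identical residues over aligned columns. It is an illustration only: the function name `alignment_identity` is hypothetical, and treating lowercase residues as unaligned (and therefore skipping them) is an assumption about the display convention, not something the report states.

```python
def alignment_identity(query: str, subject: str) -> float:
    """Fraction of aligned columns in which both rows carry the same residue.

    Columns with a gap ('-') or a lowercase residue in either row are skipped;
    skipping lowercase residues is an assumption about the display convention.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    aligned = 0
    identical = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue  # not an aligned residue pair
        aligned += 1
        if q == s:
            identical += 1
    # Avoid division by zero when no column is aligned.
    return identical / aligned if aligned else 0.0
```

The function takes only the sequence portions of the two rows, for example `alignment_identity("VLPVSPSVTPGVLQAVS", "PPPRSPSPEPTVVNTPS")` for the final Atrophin-1 block covering residues 293-309, with the identifiers and coordinates stripped off first.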
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
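
Every hit in this report carries a summary line of the form `Pssm-ID: ...  Cd Length: ...  Bit Score: ...  E-value: ...`, and the preset options list an E-value threshold of 0.01. The Python sketch below shows one way to pull those fields out of the plain-text report and compare them against a cutoff. The regular expression and the names `HitSummary`, `parse_hit_summary`, and `passes_threshold` are illustrative assumptions rather than any NCBI API, and using an inclusive comparison (`<=`) is likewise an assumption.

```python
import re
from typing import NamedTuple, Optional

class HitSummary(NamedTuple):
    pssm_id: int
    cd_length: int
    bit_score: float
    e_value: float

# Matches summary lines as they appear in this plain-text report.
_SUMMARY = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)

def parse_hit_summary(line: str) -> Optional[HitSummary]:
    """Parse one summary line; return None if the line is not a summary."""
    m = _SUMMARY.search(line)
    if m is None:
        return None
    return HitSummary(
        pssm_id=int(m.group(1)),
        cd_length=int(m.group(2)),
        bit_score=float(m.group(3)),
        e_value=float(m.group(4)),
    )

def passes_threshold(hit: HitSummary, threshold: float = 0.01) -> bool:
    """True if the hit's E-value is at or below the cutoff (inclusive by assumption)."""
    return hit.e_value <= threshold
```

Applied to the SP1-4_arthropods_N summary line, `parse_hit_summary` would yield Pssm-ID 411778 with an E-value of 1.01e-04, which falls under the 0.01 cutoff.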

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D)384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D)265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D)200-3.