Conserved domains on  [gi|124505249|ref|XP_001351366|]

pre-mRNA-processing-splicing factor 8, putative [Plasmodium falciparum 3D7]

List of domain hits

Name Accession Description Interval E-value
PRP8 super family cl34928
U5 snRNP spliceosome subunit [RNA processing and modification];
400-3135 0e+00


The actual alignment was detected with superfamily member COG5178:

Pssm-ID: 227505 [Multi-domain]  Cd Length: 2365  Bit Score: 3069.56  E-value: 0e+00
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  400 VVEEKEEMPCEHLRKIVKEHGDMSNKKYRYDKRVYLGALKYIPHAVFKLLENIPMPWEQIKNTKVIYHITGAITFVNETF 479
Cdd:COG5178    77 VLTLKAPIPPEHLRKIQSPCSDMPSVLTKVDKRSYLGALKYLPHAVLKLLENMPSPWEDVSEVKVLYHCHGAITFVNEVP 156
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  480 VVIDPLYIAQWGTMWIMMRREKRDRKHFKRMRFPPFDDEEPPLDYADNILDIEPLECIQMKLDKDEDKSVIDWFYDSKPL 559
Cdd:COG5178   157 RVIEPQLFAQWGLCWSPMRREKRDRYSFKRKRFPPFDDLEPPLSKSQWVLGVEPLMPINIRLDRMDDEHVRDWVYTSRDL 236
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  560 lYNRNHIPGTSYKKYKLSLEQMGVLYRLGNQLFSDFQDDNYFYLFNLKSFYTAKALNMAIPGGPKFEPLYRDIYEDDEDW 639
Cdd:COG5178   237 -EDHPSVNGAMYRRWKYMLPAMHNLLRLMPMLWESIRDVNYVYLFSGLSFFVAKALNVAIPGGPKFEPLYSRESAEFEDE 315
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  640 NEFNDINKIIIRQQIRTEYKIAFPYLYNNRPRKIAVSKYHSPMCVYIKL-EDIDLPPFYFDLIINPIPSYKIrkfnksse 718
Cdd:COG5178   316 NEFNGIVRIIRRPPIDDEYPVAFPGLYNSRPRSVAVECYGSPECRDVFLdEDEDYPANFKDPLINPILGVQL-------- 387
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  719 kkdselfdddfyltytrkeiyyydhgdddkkkkstsksrkhskhsdaDDNRYDkgyrkyrkssssyksfkrdkrksTNSS 798
Cdd:COG5178   388 -----------------------------------------------DNHPYD-----------------------GKGS 397
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  799 NDKDIDEEDYNSGVSSIDnndnsdtyissskynsnnmssrtsknkdetyeidstvendshdgslkkeknkkkrknpyndd 878
Cdd:COG5178   398 NEESCVMERKLFSEPIFY-------------------------------------------------------------- 415
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  879 nykgddknksddddnykgddnndnnnkyksdnissckknkkmiikhveygilPLLHNYPLYTERTINGIQLYHAPYPFNK 958
Cdd:COG5178   416 ----------------------------------------------------PYLYNESTEVRTTERAHLLLKNPFPFNK 443
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  959 KCGYTRRGIDIPLVQSWFKEHISTKYPVKVRVSYQKLLKCWVLNHLHSKRPKSMKKKYLFRIFKSTKFFQCTEMDWVEVG 1038
Cdd:COG5178   444 GKGRAERAQDVPLDKPWLLGHCLQERPVKVPVSYQKLLKNYVRNMLHKTRPRPHTNTHLLKELKNTKYFQRTEIDWVEAG 523
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1039 LQVCRQGYNMLNLLIHRKNLNYLHLDYNFNLKPVKTLTTKERKKSRFGNAFHLCREILRLTKSIVDSHVQYRLGNIDAYQ 1118
Cdd:COG5178   524 LQLCRQGHNMLSLLIHRKGLTYLHLDYNFNLKPTKTLTTKERKKSRVGNSFHLMREMLKFIKLIVDIHVQFRLGNIDAYQ 603
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1119 LADGIQYIFAHVGQLTGMYRYKYRLMRQVRMCKDLKHLIYYRFNTGsVGKGPGCGLWAPLWRVWIFFLRGVIPLLERWLS 1198
Cdd:COG5178   604 LADGVHYILNHVGQLTGIYRYKYKLMKQIRACKDWKHLIYYAFNEG-VGKGPGCGFWGPQWRVWLFFLRGHIPLLERYIG 682
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1199 NLLARQFEGRvSKGIAKTVTKQRVESHFDLELRAAVMHDIIDMIPEGLKnnKGKARLILQHLSEAWRCWKANIPWKVVGL 1278
Cdd:COG5178   683 NLVTRQFEGR-SDYNPKPLTKQRSDSGYDLELRRQVMADILSMLPEGIR--QTKVRTILQHLSEAWRCWKANIPWHVPGE 759
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1279 PLPVENIIIRYIKLKADWWVNATYYNRERIKRGATVDKTVCKKNLGRLTRLWLKAEQERQHEYLKDGPYVSGEEAVALYT 1358
Cdd:COG5178   760 PAPILEVIRRYIKSKADLWTSSAHFNRERISRGAGVGKTKEKKNLGRLTRLWVKLEQERQVDSAKVGPKSTKEEAKRIGK 839
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1359 TAIHWFESRKFTHIPFPPLNYKHDTKLLILALEKLKETFTVKNRLNQSQREELGFIEQAYDNPYETLSRIKRHLLTQRAF 1438
Cdd:COG5178   840 ITVLWLESRMFEPIPFPPLRYKEDTKILVLALEYLKSKYTGKIRLNESTREELALLEKAYDNPHDTLFRIKKSLLTQRSF 919
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1439 KEISISFLDLYTHLVPVYEVDPLEKITDAYLDQYLWYEGDLRNLFPNWVKPSDNEPQPLLVYKMCQGINNLHNIWDTKNN 1518
Cdd:COG5178   920 KEVGITLMRHYDGAIPVYSVDPVEKIVDAYLDQYLWYEADRRNLFPEWIKPSDSEMPPLLVYKWCQGINNLKAAWDTSNG 999
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1519 ECVVMLQTQFSKIYEKIDLTLLNRLLRLIVDHNIADYITAKNNTNITFKDMNHINSFGIIRGLQFSSFVFQYYTIIIDLL 1598
Cdd:COG5178  1000 ERLVLYETKLEGIMEKVDNTLLNRLLKLVLDPNLADYIIAKNNVVVVYKDMSHTNHYGLIRGLQFSSFIYQFYGLVVDLL 1079
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1599 ILGLTRAYDIAGPYNDVNQFLTFQNVQIETRHPIRLYCRYVDKIWILFKFTNEESKDLIQKFLTENPDPNNENIVGYNN- 1677
Cdd:COG5178  1080 VLGLQRATEIAGPADAPNVFMDFKSRATETSHPIRLYTRYMDDIYIVFRFQRKEEDSLLEDYLRENPDPEEANNERYRNy 1159
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1678 -KTCWPRDCRMRKMKHDVNLGRATFWEIQNRIPRSLTSLDWDhyNTFVSVYSKDNPNLLFSIAGFEVRILPKIRQlsygy 1756
Cdd:COG5178  1160 fKGCWPDDCRMRLGPLDVNLGRAVFWEILRRCPHSLTATRWE--PSFGSVYSKINPNLLFSMVGFEVRILPKIRK----- 1232
                        1370      1380      1390      1400      1410      1420      1430      1440
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1757 ngimytsymneyprgvgtkdetskkngllhddekskkvgslkdevtkgkshvdkNEENsdnnkndnkndsthanthdmvg 1836
Cdd:COG5178  1233 ------------------------------------------------------IEER---------------------- 1236
                        1450      1460      1470      1480      1490      1500      1510      1520
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1837 dnnydggvknnfynssggeknvvvssSVKEGTWKLQNEMTKEITAEAYLKVSDNSMKRFENRVRQILMSSGSTTFTKIAN 1916
Cdd:COG5178  1237 --------------------------SLSSGVWRLGDGRTKQRTAHANLAVSEGGIEMFESRIRHILMTSGSTTFTKVAT 1290
                        1530      1540      1550      1560      1570      1580      1590      1600
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1917 KWNTTLIGLMTYFREAVLDTEELLDLLVKCENKIQTRIKIGLNSKMPSRFPPVVFYTPKELGGLGMLSMGHILIPESDLR 1996
Cdd:COG5178  1291 KWNTQLIALVTYYREAICDTKGLLDKLVKAERLIQNRVKKGLNSKMPVRFPPAVFYAPKELGGLGMLSVGHILIPHSDLE 1370
                        1610      1620      1630      1640      1650      1660      1670      1680
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1997 YMKQTDNGrITHFRSGLSHEEDQLIPNLYRYISTWESEFLESQRVWCEYALKRNECHNQNKKITLEDLEDSWDKGIPRIN 2076
Cdd:COG5178  1371 WSKQTDTG-ITHFRSGMTTNGERLIPAAMRYISRWEYEFEDSQRVWAEYARKRQEAGQQNRRLTLEDLEMSWDRGIPRIS 1449
                        1690      1700      1710      1720      1730      1740      1750      1760
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2077 TLFQKDRHTLAYDKGWRIRQLFKQYQIIKSNPFWWTNQRHDGKLWNLNNYRTDMIQALGGVEGILEHTLFKGTFFPTWEG 2156
Cdd:COG5178  1450 TLFQRDRHTLAYDRGFRMRSEFKQYSLKPNNPFWWTDAKHDGKLWSLNRYRLDVIQALGGVEGILEHTLFKATGFRSWEG 1529
                        1770      1780      1790      1800      1810      1820      1830      1840
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2157 LFWEKASGFEESMKYKKLTNAQRSGLNQIPNRRFTLWWSPTINRANVYVGFQVQLDLTGIFMHGKIPTLKISLIQIFRAH 2236
Cdd:COG5178  1530 LFWEKASGFEESMKFKKLTNAQRMGLSQIPNRRFTLWWSPTINRANVYVGFQVQLDLTGILMHGKIPTLKISLIQIFRNH 1609
                        1850      1860      1870      1880      1890      1900      1910      1920
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2237 LWQKIHESLVMDICQVFDLNCDLLDIETVQKETIHPRKSYKMNSSCADILLFANYKWGISKPSLLTDEDHIFTNntlgst 2316
Cdd:COG5178  1610 LWQKIHESVVGDLCQVLDKELDVLQIETVQKETVHPRKSYKMNSSCADILLSGAYDWCVSSPSLLLEERDGGSN------ 1683
                        1930      1940      1950      1960      1970      1980      1990      2000
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2317 sgtnnnimlnsnminsgsnnsssnnmnsvsfgsfpYTSNQFWIDIQLRWGDFDSHDIERYSRAKFLDYTTDNLSIYPCLT 2396
Cdd:COG5178  1684 -----------------------------------VRTNKLWIDVQLRWGDYDSHDIHRYARAKFLDYTTDPQSMYPSPT 1728
                        2010      2020      2030      2040      2050      2060      2070      2080
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2397 GVLIGVDLAYNLYSAYGNWFNNLKPLMQKALQKIVQSNPSLYVLRERIRKGLQLYSSEPTEPYLNTQNYNELFSSQTIWF 2476
Cdd:COG5178  1729 GVVIGIDLCYNMWSAYGNWNEGLKPLIQSSMERIMKANPALYVLRERIRKGLQLYTSEPQEQYLSSSNYAELFSNSIDLF 1808
                        2090      2100      2110      2120      2130      2140      2150      2160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2477 VDDTNVYRVTIHKTFEGNLTTKPINGAIFILNPKTGQLFLKIIHTSVWIGQKRLSQLAKWKTAEEVASLIRSLPIEEQPK 2556
Cdd:COG5178  1809 VDDTNVYRVTLHKTFEGNLTTKPINGAIFVLNPATGNLFLKVIHTSVWAGQKRLIQLAKWKTAEEVFALGRSLPVEEQPK 1888
                        2170      2180      2190      2200      2210      2220      2230      2240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2557 QIIVTRKGMLDPLEVHLLDFPNIIIKGTELNLPFQALLKLNKIGDLILKATQPQMLLFNLYDDWLNSISSFTAFSRLILI 2636
Cdd:COG5178  1889 QIIVTRKSMLDPLEVHILDFPNISIRTCELALPFSAVMGIDKIRDLILRATEPQMVLFNLYDDWLQETSSYTAFSRLLLV 1968
                        2250      2260      2270      2280      2290      2300      2310      2320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2637 LRSLHINPQQTKILLQPNKNIvTTQPHHIWPSFNNNQWIHLEVQLKDLILNDYSKRNNVHIASLTQNEIRDILLGMEITP 2716
Cdd:COG5178  1969 LRALDVNEERVKEILRPDKSI-ITKINHLWPGFSDSQWIKKEIQLRDLILDRYCSKHNINPSGLTQSEVRDIILGFRISA 2047
                        2330      2340      2350      2360      2370      2380      2390      2400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2717 PSIQRQQIAELEKNNLDLMEQQMKVTTSKTTTKHGNEIIVSTLSPHEQQTFTTKTDWKIRYLANNSLLFRTKNIYVnnnn 2796
Cdd:COG5178  2048 PSGARQETAETEKQNSEKALSRPTNVSTKTINGWGREYVVLDGMIYEGEKFSSKEEWRSEAIRTGPLELRTKNIYV---- 2123
                        2410      2420      2430      2440      2450      2460      2470      2480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2797 msnmsnintiSASASSHNIlnkngtnsdnqnshyhtsinsinDYTYVIAKNLLEKFICISDLKIQVGGFLFGSSPEDNSY 2876
Cdd:COG5178  2124 ----------TADENEESI-----------------------QQMYRLPLNLLEKFMRISDPHVQVAGLVYGKSGSDNPQ 2170
                        2490      2500      2510      2520      2530      2540      2550      2560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2877 VKEIKCILIPPQIGNYQSVTLSSYMPSSK-YLQNLELLGWIHTQTTNCSntnnHLTAYDMVAHfnflqeckrqmskGKKV 2955
Cdd:COG5178  2171 IKEILSFGLVPQLGSLSGVQSSSFVPHDLpGDEDLEILGWIHTQDDELP----YLEVAGVLTH-------------RKKI 2233
                        2570      2580      2590      2600      2610      2620      2630      2640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2956 ADashndddvddyddddynnneddynnnneddninnnseggtkrdetyKMWDknkTIILTCSFTPGSCTINAYKLTSDGY 3035
Cdd:COG5178  2234 VD----------------------------------------------PEWD---AVTLTVSYLPGSISLRAYVVKKEGC 2264
                        2650      2660      2670      2680      2690      2700      2710      2720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 3036 SFakskkNSSDLYVFPNVNNLYEPV-----QILLSNVFVGYFLIPDDHIWNYNLMGIKFNNNQKYAPHLDIPQPFYADIH 3110
Cdd:COG5178  2265 NW-----GSKNMDINSDEAIGVEPVlgkdcQLLLSDRIQGVFYVPEEEVWNYNFAGPFFDDRLEYTWKIGMPLGFYDGFH 2339
                        2730      2740
                  ....*....|....*....|....*
gi 124505249 3111 RPNHFLQFSLLDQRDADEADVETSF 3135
Cdd:COG5178  2340 RPGHFSRFYELRAGGRLEEWQEDAF 2364
Amelogenin super family cl33250
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
114-216 4.53e-08

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth. They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9 bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


The actual alignment was detected with superfamily member smart00818:

Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 55.18  E-value: 4.53e-08
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249    114 IPYNNMNafPPNMPKLPTN-MPFLPPNMPILPPHLQhmpnvlphlqnMPnVPPHLASFPNMINLPNLP-PHMHNLPPNMH 191
Cdd:smart00818   40 IPVSQQH--PPTHTLQPHHhIPVLPAQQPVVPQQPL-----------MP-VPGQHSMTPTQHHQPNLPqPAQQPFQPQPL 105
                            90       100
                    ....*....|....*....|....*....
gi 124505249    192 SLPPHMHNLPP----NMHSLPPNMNYIPP 216
Cdd:smart00818  106 QPPQPQQPMQPqppvHPIPPLPPQPPLPP 134
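The pairwise blocks above can be summarised programmatically. The following is a minimal sketch, not part of the CDD output: it takes the two row pairs of the smart00818 hit (labels and coordinates stripped by hand) and counts columns, residues, and gaps. It assumes that "-" marks a gap and that lowercase letters mark residues that are not aligned to the model; the names ROWS and summarize are hypothetical.

```python
# Row pairs copied from the smart00818 alignment above:
# (query row, Cdd row), without the leading labels and coordinates.
ROWS = [
    ("IPYNNMNafPPNMPKLPTN-MPFLPPNMPILPPHLQhmpnvlphlqnMPnVPPHLASFPNMINLPNLP-PHMHNLPPNMH",
     "IPVSQQH--PPTHTLQPHHhIPVLPAQQPVVPQQPL-----------MP-VPGQHSMTPTQHHQPNLPqPAQQPFQPQPL"),
    ("SLPPHMHNLPP----NMHSLPPNMNYIPP",
     "QPPQPQQPMQPqppvHPIPPLPPQPPLPP"),
]


def summarize(rows):
    """Count columns, residues, gaps, and identities over paired alignment rows."""
    stats = {
        "columns": 0,
        "query_residues": 0,
        "subject_residues": 0,
        "query_gaps": 0,
        "subject_gaps": 0,
        "aligned": 0,      # columns where both rows carry an uppercase residue
        "identical": 0,    # aligned columns with the same residue in both rows
    }
    for query, subject in rows:
        if len(query) != len(subject):
            raise ValueError("paired rows must have the same width")
        for q, s in zip(query, subject):
            stats["columns"] += 1
            if q == "-":
                stats["query_gaps"] += 1
            else:
                stats["query_residues"] += 1
            if s == "-":
                stats["subject_gaps"] += 1
            else:
                stats["subject_residues"] += 1
            # Assumption: lowercase letters are unaligned, so skip them here.
            if q.isupper() and s.isupper():
                stats["aligned"] += 1
                if q == s:
                    stats["identical"] += 1
    return stats


stats = summarize(ROWS)
print(stats)
```

The residue counts give a quick consistency check against the printed coordinates: the query interval 114-216 spans 103 residues and the model interval 40-134 spans 95, so a mismatch would point to a copy error in the rows.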
 
Name Accession Description Interval E-value
PRP8 COG5178
U5 snRNP spliceosome subunit [RNA processing and modification];
400-3135 0e+00

PROCN pfam08083
PROCN (NUC071) domain; The PROCN domain is the central domain in pre-mRNA splicing factors of ...
940-1343 0e+00

PROCN (NUC071) domain; The PROCN domain is the central domain in pre-mRNA splicing factors of PRO8 family.


Pssm-ID: 400431  Cd Length: 402  Bit Score: 832.42  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249   940 TERTINGIQLYHAPYPFNKKCGYTRRGIDIPLVQSWFKEHISTKYPVKVRVSYQKLLKCWVLNHLHSKRPKSMKKKYLFR 1019
Cdd:pfam08083    1 NENTKDGISLYWAPYPFNRRSGRTKRAQDVPLIKHWYREHPPSNYPVKVRVSYQKLLKNYVLNELHHRKQKKHKKKNLLK 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  1020 IFKSTKFFQCTEMDWVEVGLQVCRQGYNMLNLLIHRKNLNYLHLDYNFNLKPVKTLTTKERKKSRFGNAFHLCREILRLT 1099
Cdd:pfam08083   81 SLKNTKFFQQTTIDWVEAGLQVCRQGHNMLNLLIHRKGLTYLHLDYNFNLKPTKTLTTKERKKSRFGNAFHLIRELLRLT 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  1100 KSIVDSHVQYRLGNIDAYQLADGIQYIFAHVGQLTGMYRYKYRLMRQVRMCKDLKHLIYYRFNTGSVGKGPGCGLWAPLW 1179
Cdd:pfam08083  161 KLLVDAHVQYRLGNIDAYQLADGLQYIFNHVGQLTGMYRYKYKVMRQIRACKDLKHVIYSRFNTGEVGKGPGCGFWAPSW 240
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  1180 RVWIFFLRGVIPLLERWLSNLLARQFEGRVSKGIAKTVTKQRVESHFDLELRAAVMHDIIDMIPEGLKNNkgKARLILQH 1259
Cdd:pfam08083  241 RVWIFFLRGIIPLLERWLGNLLARQFEGRKSKDVAKTITKQRVESYYDLELRASVMHDILDMMPEGVKQN--KARTILQH 318
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  1260 LSEAWRCWKANIPWKVVGLPLPVENIIIRYIKLKADWWVNATYYNRERIKRGATVDKTVCKKNLGRLTRLWLKAEQERQH 1339
Cdd:pfam08083  319 LSEAWRCWKANIPWKVPGLPEPIENIILRYVKAKADWWTSVAHYNRERIKRGATVDKTVAKKNLGRLTRLWLKAEQERQA 398

                   ....
gi 124505249  1340 EYLK 1343
Cdd:pfam08083  399 NYLK 402
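
As a reading aid for the alignment blocks on this page, the following minimal sketch counts identical residues between a query row and a Cdd row. It assumes that lowercase letters and dashes mark unaligned or gapped columns, which is how the blocks appear to be laid out here. The function name `aligned_identity` is hypothetical and not part of any NCBI tool.

```python
def aligned_identity(query: str, subject: str) -> tuple[int, int]:
    """Return (identical, aligned) column counts for two alignment rows.

    Assumption: columns holding a dash or a lowercase residue in either
    row are treated as unaligned and are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, s in zip(query, subject):
        # Skip gap columns and lowercase (unaligned) residues.
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Last block of the PROCN alignment: query 1340-1343 against model 399-402.
identical, aligned = aligned_identity("EYLK", "NYLK")
print(f"{identical}/{aligned} identical columns")
```

Concatenating the sequence portions of every block in a hit, with coordinates and labels stripped, gives the counts over the whole interval.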
RNase_H_like_Prp8_IV cd13838
Ribonuclease-like Prp8 domain IV core; This family contains Prp8 domain IV, which adopts a ...
2463-2714 2.25e-168

Ribonuclease-like Prp8 domain IV core; This family contains Prp8 domain IV, which adopts an RNase H-like fold within its core structure but with little sequence similarity. Prp8, a spliceosome protein, interacts directly with the splice sites and branch regions of precursor-mRNAs and spliceosomal RNAs associated with catalysis of the two steps of splicing. Catalysis of RNA cleavage by RNase H-like proteins involves a two-metal mechanism in which adjacently bound divalent magnesium ions promote hydrolysis by activation of a water nucleophile and stabilization of the transition state. However, the Prp8 domain IV contains only one of the canonical metal-binding sites, and the coordinating side chains are spatially conserved with respect to Mg2+-coordinating residues within the RNase H fold.


Pssm-ID: 260013  Cd Length: 251  Bit Score: 518.44  E-value: 2.25e-168
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2463 QNYNELFSSQTIWFVDDTNVYRVTIHKTFEGNLTTKPINGAIFILNPKTGQLFLKIIHTSVWIGQKRLSQLAKWKTAEEV 2542
Cdd:cd13838     1 QNYGELFSNQIIWFVDDTNVYRVTIHKTFEGNLTTKPINGAIFIFNPRTGQLFLKIIHTSVWAGQKRLGQLAKWKTAEEV 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2543 ASLIRSLPIEEQPKQIIVTRKGMLDPLEVHLLDFPNIIIKGTELNLPFQALLKLNKIGDLILKATQPQMLLFNLYDDWLN 2622
Cdd:cd13838    81 AALIRSLPVEEQPKQIIVTRKGMLDPLEVHLLDFPNIVIKGSELQLPFQACLKIEKFGDLILKATEPQMVLFNLYDDWLK 160
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2623 SISSFTAFSRLILILRSLHINPQQTKILLQPNKNIVtTQPHHIWPSFNNNQWIHLEVQLKDLILNDYSKRNNVHIASLTQ 2702
Cdd:cd13838   161 TISSYTAFSRLILILRALHVNNEKAKIILKPDKTVI-TEPHHIWPTLSDEEWIKVEVQLKDLILADYGKKNNVNVASLTQ 239
                         250
                  ....*....|..
gi 124505249 2703 NEIRDILLGMEI 2714
Cdd:cd13838   240 SEIRDIILGMEI 251
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
114-216 4.53e-08

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 55.18  E-value: 4.53e-08
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249    114 IPYNNMNafPPNMPKLPTN-MPFLPPNMPILPPHLQhmpnvlphlqnMPnVPPHLASFPNMINLPNLP-PHMHNLPPNMH 191
Cdd:smart00818   40 IPVSQQH--PPTHTLQPHHhIPVLPAQQPVVPQQPL-----------MP-VPGQHSMTPTQHHQPNLPqPAQQPFQPQPL 105
                            90       100
                    ....*....|....*....|....*....
gi 124505249    192 SLPPHMHNLPP----NMHSLPPNMNYIPP 216
Cdd:smart00818  106 QPPQPQQPMQPqppvHPIPPLPPQPPLPP 134
JAB_MPN smart00232
JAB/MPN domain; Domain in Jun kinase activation domain binding protein and proteasomal ...
2840-2947 4.29e-07

JAB/MPN domain; Domain in Jun kinase activation domain binding protein and proteasomal subunits. Domain at Mpr1p and Pad1p N-termini. Domain of unknown function.


Pssm-ID: 214573  Cd Length: 135  Bit Score: 51.61  E-value: 4.29e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249   2840 YTYVIAKNLLEKFIciSDLKIQVGGFLFGSSPEDNSYVKEIKCILIPPQIG-NYQSVTLSSYMPSSKYLQ---NLELLGW 2915
Cdd:smart00232    4 VHPLVPLNILKHAI--RDGPEEVCGVLLGKSNKDRPEVKEVFAVPNEPQDDsVQEYDEDYSHLMDEELKKvnkDLEIVGW 81
                            90       100       110
                    ....*....|....*....|....*....|..
gi 124505249   2916 IHTQTtncsNTNNHLTAYDMVAHFNFLQECKR 2947
Cdd:smart00232   82 YHSHP----DESPFPSEVDVATHESYQAPWPI 109
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
100-229 5.51e-06

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 51.35  E-value: 5.51e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249   100 PQNVPNGFINNIGNIPYNN----MNAFPPNMPKLPTNMPFLPPNMPILPPHLQHMPNVLPHlQNMPNVPPH-LASFPNMI 174
Cdd:pfam15279  160 PPGSPPMSMTPRGLLGKPQqhppPSPLPAFMEPSSMPPPFLRPPPSIPQPNSPLSNPMLPG-IGPPPKPPRnLGPPSNPM 238
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 124505249   175 NLPNLPPHMHNLPPNMHSLPPHMHNLPPNMHSlPPNMNYIPPGinnympNMMNMP 229
Cdd:pfam15279  239 HRPPFSPHHPPPPPTPPGPPPGLPPPPPRGFT-PPFGPPFPPV------NMMPNP 286
KLF1_2_4_N-like cd22056
N-terminal domain of Kruppel-like factors with similarity to the N-terminal domains of ...
146-217 1.80e-03

N-terminal domain of Kruppel-like factors with similarity to the N-terminal domains of Kruppel-like factor (KLF)1, KLF2, and KLF4; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domains of an unknown subfamily of KLFs, predominantly found in fish, related to the N-terminal domains of KLF1, KLF2, and KLF4.
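
The degenerate consensus quoted in the description above (NNRCRCCYY, with N any nucleotide, R = A/G, Y = C/T) can be turned into a pattern for scanning a DNA string. The sketch below is illustrative only: the helper names and the test sequence are hypothetical, only the forward strand is searched, and a match says nothing about actual KLF binding.

```python
import re

# Expansion of the degenerate codes used in the consensus motif.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Compile a degenerate motif into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def find_motif(motif: str, sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for every forward-strand hit."""
    core = motif_to_regex(motif).pattern
    # A lookahead keeps overlapping occurrences from being skipped.
    scanner = re.compile(f"(?=({core}))")
    return [(m.start(), m.group(1)) for m in scanner.finditer(sequence.upper())]


# Hypothetical test sequence, not taken from any real promoter.
hits = find_motif("NNRCRCCYY", "TTAAGCACCCTGG")
print(hits)
```

To cover both strands, the same scan would be repeated on the reverse complement of the sequence.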


Pssm-ID: 409231 [Multi-domain]  Cd Length: 339  Bit Score: 43.49  E-value: 1.80e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 124505249  146 HLQHMPNVLPHLQNMPNVPPHLASFPNminlPNLPPHMHNLPPNMHSLPPHMHNLPPNMHSlPPNMNYIPPG 217
Cdd:cd22056   209 PKHQMHSVHPQAFTHHQAAGPGALQGR----GGRGGPDCHLLHSSHHHHHHHHLQYQYMNA-PYPPHYAHQG 275
 
Name Accession Description Interval E-value
PRP8 COG5178
U5 snRNP spliceosome subunit [RNA processing and modification];
400-3135 0e+00

U5 snRNP spliceosome subunit [RNA processing and modification];


Pssm-ID: 227505 [Multi-domain]  Cd Length: 2365  Bit Score: 3069.56  E-value: 0e+00
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  400 VVEEKEEMPCEHLRKIVKEHGDMSNKKYRYDKRVYLGALKYIPHAVFKLLENIPMPWEQIKNTKVIYHITGAITFVNETF 479
Cdd:COG5178    77 VLTLKAPIPPEHLRKIQSPCSDMPSVLTKVDKRSYLGALKYLPHAVLKLLENMPSPWEDVSEVKVLYHCHGAITFVNEVP 156
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  480 VVIDPLYIAQWGTMWIMMRREKRDRKHFKRMRFPPFDDEEPPLDYADNILDIEPLECIQMKLDKDEDKSVIDWFYDSKPL 559
Cdd:COG5178   157 RVIEPQLFAQWGLCWSPMRREKRDRYSFKRKRFPPFDDLEPPLSKSQWVLGVEPLMPINIRLDRMDDEHVRDWVYTSRDL 236
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  560 lYNRNHIPGTSYKKYKLSLEQMGVLYRLGNQLFSDFQDDNYFYLFNLKSFYTAKALNMAIPGGPKFEPLYRDIYEDDEDW 639
Cdd:COG5178   237 -EDHPSVNGAMYRRWKYMLPAMHNLLRLMPMLWESIRDVNYVYLFSGLSFFVAKALNVAIPGGPKFEPLYSRESAEFEDE 315
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  640 NEFNDINKIIIRQQIRTEYKIAFPYLYNNRPRKIAVSKYHSPMCVYIKL-EDIDLPPFYFDLIINPIPSYKIrkfnksse 718
Cdd:COG5178   316 NEFNGIVRIIRRPPIDDEYPVAFPGLYNSRPRSVAVECYGSPECRDVFLdEDEDYPANFKDPLINPILGVQL-------- 387
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  719 kkdselfdddfyltytrkeiyyydhgdddkkkkstsksrkhskhsdaDDNRYDkgyrkyrkssssyksfkrdkrksTNSS 798
Cdd:COG5178   388 -----------------------------------------------DNHPYD-----------------------GKGS 397
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  799 NDKDIDEEDYNSGVSSIDnndnsdtyissskynsnnmssrtsknkdetyeidstvendshdgslkkeknkkkrknpyndd 878
Cdd:COG5178   398 NEESCVMERKLFSEPIFY-------------------------------------------------------------- 415
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  879 nykgddknksddddnykgddnndnnnkyksdnissckknkkmiikhveygilPLLHNYPLYTERTINGIQLYHAPYPFNK 958
Cdd:COG5178   416 ----------------------------------------------------PYLYNESTEVRTTERAHLLLKNPFPFNK 443
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  959 KCGYTRRGIDIPLVQSWFKEHISTKYPVKVRVSYQKLLKCWVLNHLHSKRPKSMKKKYLFRIFKSTKFFQCTEMDWVEVG 1038
Cdd:COG5178   444 GKGRAERAQDVPLDKPWLLGHCLQERPVKVPVSYQKLLKNYVRNMLHKTRPRPHTNTHLLKELKNTKYFQRTEIDWVEAG 523
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1039 LQVCRQGYNMLNLLIHRKNLNYLHLDYNFNLKPVKTLTTKERKKSRFGNAFHLCREILRLTKSIVDSHVQYRLGNIDAYQ 1118
Cdd:COG5178   524 LQLCRQGHNMLSLLIHRKGLTYLHLDYNFNLKPTKTLTTKERKKSRVGNSFHLMREMLKFIKLIVDIHVQFRLGNIDAYQ 603
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1119 LADGIQYIFAHVGQLTGMYRYKYRLMRQVRMCKDLKHLIYYRFNTGsVGKGPGCGLWAPLWRVWIFFLRGVIPLLERWLS 1198
Cdd:COG5178   604 LADGVHYILNHVGQLTGIYRYKYKLMKQIRACKDWKHLIYYAFNEG-VGKGPGCGFWGPQWRVWLFFLRGHIPLLERYIG 682
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1199 NLLARQFEGRvSKGIAKTVTKQRVESHFDLELRAAVMHDIIDMIPEGLKnnKGKARLILQHLSEAWRCWKANIPWKVVGL 1278
Cdd:COG5178   683 NLVTRQFEGR-SDYNPKPLTKQRSDSGYDLELRRQVMADILSMLPEGIR--QTKVRTILQHLSEAWRCWKANIPWHVPGE 759
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1279 PLPVENIIIRYIKLKADWWVNATYYNRERIKRGATVDKTVCKKNLGRLTRLWLKAEQERQHEYLKDGPYVSGEEAVALYT 1358
Cdd:COG5178   760 PAPILEVIRRYIKSKADLWTSSAHFNRERISRGAGVGKTKEKKNLGRLTRLWVKLEQERQVDSAKVGPKSTKEEAKRIGK 839
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1359 TAIHWFESRKFTHIPFPPLNYKHDTKLLILALEKLKETFTVKNRLNQSQREELGFIEQAYDNPYETLSRIKRHLLTQRAF 1438
Cdd:COG5178   840 ITVLWLESRMFEPIPFPPLRYKEDTKILVLALEYLKSKYTGKIRLNESTREELALLEKAYDNPHDTLFRIKKSLLTQRSF 919
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1439 KEISISFLDLYTHLVPVYEVDPLEKITDAYLDQYLWYEGDLRNLFPNWVKPSDNEPQPLLVYKMCQGINNLHNIWDTKNN 1518
Cdd:COG5178   920 KEVGITLMRHYDGAIPVYSVDPVEKIVDAYLDQYLWYEADRRNLFPEWIKPSDSEMPPLLVYKWCQGINNLKAAWDTSNG 999
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1519 ECVVMLQTQFSKIYEKIDLTLLNRLLRLIVDHNIADYITAKNNTNITFKDMNHINSFGIIRGLQFSSFVFQYYTIIIDLL 1598
Cdd:COG5178  1000 ERLVLYETKLEGIMEKVDNTLLNRLLKLVLDPNLADYIIAKNNVVVVYKDMSHTNHYGLIRGLQFSSFIYQFYGLVVDLL 1079
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1599 ILGLTRAYDIAGPYNDVNQFLTFQNVQIETRHPIRLYCRYVDKIWILFKFTNEESKDLIQKFLTENPDPNNENIVGYNN- 1677
Cdd:COG5178  1080 VLGLQRATEIAGPADAPNVFMDFKSRATETSHPIRLYTRYMDDIYIVFRFQRKEEDSLLEDYLRENPDPEEANNERYRNy 1159
                        1290      1300      1310      1320      1330      1340      1350      1360
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1678 -KTCWPRDCRMRKMKHDVNLGRATFWEIQNRIPRSLTSLDWDhyNTFVSVYSKDNPNLLFSIAGFEVRILPKIRQlsygy 1756
Cdd:COG5178  1160 fKGCWPDDCRMRLGPLDVNLGRAVFWEILRRCPHSLTATRWE--PSFGSVYSKINPNLLFSMVGFEVRILPKIRK----- 1232
                        1370      1380      1390      1400      1410      1420      1430      1440
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1757 ngimytsymneyprgvgtkdetskkngllhddekskkvgslkdevtkgkshvdkNEENsdnnkndnkndsthanthdmvg 1836
Cdd:COG5178  1233 ------------------------------------------------------IEER---------------------- 1236
                        1450      1460      1470      1480      1490      1500      1510      1520
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1837 dnnydggvknnfynssggeknvvvssSVKEGTWKLQNEMTKEITAEAYLKVSDNSMKRFENRVRQILMSSGSTTFTKIAN 1916
Cdd:COG5178  1237 --------------------------SLSSGVWRLGDGRTKQRTAHANLAVSEGGIEMFESRIRHILMTSGSTTFTKVAT 1290
                        1530      1540      1550      1560      1570      1580      1590      1600
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1917 KWNTTLIGLMTYFREAVLDTEELLDLLVKCENKIQTRIKIGLNSKMPSRFPPVVFYTPKELGGLGMLSMGHILIPESDLR 1996
Cdd:COG5178  1291 KWNTQLIALVTYYREAICDTKGLLDKLVKAERLIQNRVKKGLNSKMPVRFPPAVFYAPKELGGLGMLSVGHILIPHSDLE 1370
                        1610      1620      1630      1640      1650      1660      1670      1680
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 1997 YMKQTDNGrITHFRSGLSHEEDQLIPNLYRYISTWESEFLESQRVWCEYALKRNECHNQNKKITLEDLEDSWDKGIPRIN 2076
Cdd:COG5178  1371 WSKQTDTG-ITHFRSGMTTNGERLIPAAMRYISRWEYEFEDSQRVWAEYARKRQEAGQQNRRLTLEDLEMSWDRGIPRIS 1449
                        1690      1700      1710      1720      1730      1740      1750      1760
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2077 TLFQKDRHTLAYDKGWRIRQLFKQYQIIKSNPFWWTNQRHDGKLWNLNNYRTDMIQALGGVEGILEHTLFKGTFFPTWEG 2156
Cdd:COG5178  1450 TLFQRDRHTLAYDRGFRMRSEFKQYSLKPNNPFWWTDAKHDGKLWSLNRYRLDVIQALGGVEGILEHTLFKATGFRSWEG 1529
                        1770      1780      1790      1800      1810      1820      1830      1840
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2157 LFWEKASGFEESMKYKKLTNAQRSGLNQIPNRRFTLWWSPTINRANVYVGFQVQLDLTGIFMHGKIPTLKISLIQIFRAH 2236
Cdd:COG5178  1530 LFWEKASGFEESMKFKKLTNAQRMGLSQIPNRRFTLWWSPTINRANVYVGFQVQLDLTGILMHGKIPTLKISLIQIFRNH 1609
                        1850      1860      1870      1880      1890      1900      1910      1920
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2237 LWQKIHESLVMDICQVFDLNCDLLDIETVQKETIHPRKSYKMNSSCADILLFANYKWGISKPSLLTDEDHIFTNntlgst 2316
Cdd:COG5178  1610 LWQKIHESVVGDLCQVLDKELDVLQIETVQKETVHPRKSYKMNSSCADILLSGAYDWCVSSPSLLLEERDGGSN------ 1683
                        1930      1940      1950      1960      1970      1980      1990      2000
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2317 sgtnnnimlnsnminsgsnnsssnnmnsvsfgsfpYTSNQFWIDIQLRWGDFDSHDIERYSRAKFLDYTTDNLSIYPCLT 2396
Cdd:COG5178  1684 -----------------------------------VRTNKLWIDVQLRWGDYDSHDIHRYARAKFLDYTTDPQSMYPSPT 1728
                        2010      2020      2030      2040      2050      2060      2070      2080
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2397 GVLIGVDLAYNLYSAYGNWFNNLKPLMQKALQKIVQSNPSLYVLRERIRKGLQLYSSEPTEPYLNTQNYNELFSSQTIWF 2476
Cdd:COG5178  1729 GVVIGIDLCYNMWSAYGNWNEGLKPLIQSSMERIMKANPALYVLRERIRKGLQLYTSEPQEQYLSSSNYAELFSNSIDLF 1808
                        2090      2100      2110      2120      2130      2140      2150      2160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2477 VDDTNVYRVTIHKTFEGNLTTKPINGAIFILNPKTGQLFLKIIHTSVWIGQKRLSQLAKWKTAEEVASLIRSLPIEEQPK 2556
Cdd:COG5178  1809 VDDTNVYRVTLHKTFEGNLTTKPINGAIFVLNPATGNLFLKVIHTSVWAGQKRLIQLAKWKTAEEVFALGRSLPVEEQPK 1888
                        2170      2180      2190      2200      2210      2220      2230      2240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2557 QIIVTRKGMLDPLEVHLLDFPNIIIKGTELNLPFQALLKLNKIGDLILKATQPQMLLFNLYDDWLNSISSFTAFSRLILI 2636
Cdd:COG5178  1889 QIIVTRKSMLDPLEVHILDFPNISIRTCELALPFSAVMGIDKIRDLILRATEPQMVLFNLYDDWLQETSSYTAFSRLLLV 1968
                        2250      2260      2270      2280      2290      2300      2310      2320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2637 LRSLHINPQQTKILLQPNKNIvTTQPHHIWPSFNNNQWIHLEVQLKDLILNDYSKRNNVHIASLTQNEIRDILLGMEITP 2716
Cdd:COG5178  1969 LRALDVNEERVKEILRPDKSI-ITKINHLWPGFSDSQWIKKEIQLRDLILDRYCSKHNINPSGLTQSEVRDIILGFRISA 2047
                        2330      2340      2350      2360      2370      2380      2390      2400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2717 PSIQRQQIAELEKNNLDLMEQQMKVTTSKTTTKHGNEIIVSTLSPHEQQTFTTKTDWKIRYLANNSLLFRTKNIYVnnnn 2796
Cdd:COG5178  2048 PSGARQETAETEKQNSEKALSRPTNVSTKTINGWGREYVVLDGMIYEGEKFSSKEEWRSEAIRTGPLELRTKNIYV---- 2123
                        2410      2420      2430      2440      2450      2460      2470      2480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2797 msnmsnintiSASASSHNIlnkngtnsdnqnshyhtsinsinDYTYVIAKNLLEKFICISDLKIQVGGFLFGSSPEDNSY 2876
Cdd:COG5178  2124 ----------TADENEESI-----------------------QQMYRLPLNLLEKFMRISDPHVQVAGLVYGKSGSDNPQ 2170
                        2490      2500      2510      2520      2530      2540      2550      2560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2877 VKEIKCILIPPQIGNYQSVTLSSYMPSSK-YLQNLELLGWIHTQTTNCSntnnHLTAYDMVAHfnflqeckrqmskGKKV 2955
Cdd:COG5178  2171 IKEILSFGLVPQLGSLSGVQSSSFVPHDLpGDEDLEILGWIHTQDDELP----YLEVAGVLTH-------------RKKI 2233
                        2570      2580      2590      2600      2610      2620      2630      2640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2956 ADashndddvddyddddynnneddynnnneddninnnseggtkrdetyKMWDknkTIILTCSFTPGSCTINAYKLTSDGY 3035
Cdd:COG5178  2234 VD----------------------------------------------PEWD---AVTLTVSYLPGSISLRAYVVKKEGC 2264
                        2650      2660      2670      2680      2690      2700      2710      2720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 3036 SFakskkNSSDLYVFPNVNNLYEPV-----QILLSNVFVGYFLIPDDHIWNYNLMGIKFNNNQKYAPHLDIPQPFYADIH 3110
Cdd:COG5178  2265 NW-----GSKNMDINSDEAIGVEPVlgkdcQLLLSDRIQGVFYVPEEEVWNYNFAGPFFDDRLEYTWKIGMPLGFYDGFH 2339
                        2730      2740
                  ....*....|....*....|....*
gi 124505249 3111 RPNHFLQFSLLDQRDADEADVETSF 3135
Cdd:COG5178  2340 RPGHFSRFYELRAGGRLEEWQEDAF 2364
PRP8_domainIV pfam12134
PRP8 domain IV core; This domain is found in eukaryotes, and is about 20 amino acids in length. ...
2457-2687 2.26e-151

PRP8 domain IV core; This domain is found in eukaryotes, and is about 20 amino acids in length. It is found associated with pfam10597, pfam10596, pfam10598, pfam08083, pfam08082, pfam01398, pfam08084. There is a conserved LILR sequence motif. The domain is a selenomethionine domain in a subunit of the spliceosome. The function of PRP8 domain IV is believed to be interaction with the spliceosomal core.


Pssm-ID: 432353  Cd Length: 230  Bit Score: 468.77  E-value: 2.26e-151
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  2457 EPYLNTQNYNELFSSQTIWFVDDTNVYRVTIHKTFEGNLTTKPINGAIFILNPKTGQLFLKIIHTSVWIGQKRLSQLAKW 2536
Cdd:pfam12134    1 EPFLNSQNYAELFSNQTQWFVDDTNVYRVTVHKTFEGNLTTKPVNGAVFILNPKTGQLFLKIIHTSVWAGQKRLSQLAKW 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  2537 KTAEEVASLIRSLPIEEQPKQIIVTRKGMLDPLEVHLLDFPNIIIKGTELNLPFQALLKLNKIGDLILKATQPQMLLFNL 2616
Cdd:pfam12134   81 KTAEEVAALVRSLPPEEQPKQIIVTRKGMLDPLEVHLLDFPNIVIRGSELHLPFPAALKIEKLGDLVLKATEPQMVLFNI 160
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 124505249  2617 YDDWLNSISSFTAFSRLILILRSLHINPQQTKILLQPNKNIVtTQPHHIWPSFNNNQWIHLEVQLKDLILN 2687
Cdd:pfam12134  161 YDDWLKSISPYTAFSRLILILRALHINPEKTKMILRPDPTVV-TKPHHLWPTLSDQQWIEVEIQLRDLILS 230
PRO8NT pfam08082
PRO8NT (NUC069), PrP8 N-terminal domain; The PRO8NT domain is found at the N-terminus of ...
404-555 6.21e-110

PRO8NT (NUC069), PrP8 N-terminal domain; The PRO8NT domain is found at the N-terminus of pre-mRNA splicing factors of PRO8 family. The NLS or nuclear localization signal for these spliceosome proteins begins at the start and runs for 60 residues. N-terminal to this domain is a highly variable proline-rich region.


Pssm-ID: 462361 [Multi-domain]  Cd Length: 152  Bit Score: 346.59  E-value: 6.21e-110
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249   404 KEEMPCEHLRKIVKEHGDMSNKKYRYDKRVYLGALKYIPHAVFKLLENIPMPWEQIKNTKVIYHITGAITFVNETFVVID 483
Cdd:pfam08082    1 KEDMPPEHLRKIIKDHGDMSSKKFRSDKRSYLGALKYMPHAVLKLLENMPMPWEEVREVKVLYHVTGAITFVNEIPRVIE 80
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 124505249   484 PLYIAQWGTMWIMMRREKRDRKHFKRMRFPPFDDEEPPLDYADNILDIEPLECIQMKLDKDEDKSVIDWFYD 555
Cdd:pfam08082   81 PVYIAQWSTMWIMMRREKRDRRHFKRMRFPPFDDEEPPLDYADNILDVEPLEAIQMELDEEEDAAVIDWFYD 152
U6-snRNA_bdg pfam10596
U6-snRNA interacting domain of PrP8; This domain incorporates the interacting site for the ...
2098-2254 5.18e-106

U6-snRNA interacting domain of PrP8; This domain incorporates the interacting site for the U6-snRNA as part of the U4/U6.U5 tri-snRNPs complex of the spliceosome, and is the prime candidate for the role of cofactor for the spliceosome's RNA core. The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor.


Pssm-ID: 431383  Cd Length: 159  Bit Score: 335.61  E-value: 5.18e-106
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  2098 FKQYQIIKSNPFWWTNQRHDGKLWNLNNYRTDMIQALGGVEGILEHTLFKGTFFPTWEGLFWEKASGFEESMKYKKLTNA 2177
Cdd:pfam10596    1 FKKYQLLKYNPFWWTHARHDGKLWNLERYRTDIIQALGGVEGILEHTLFKATGFPSWEGLFWDKASGFEESLKYKKLTNA 80
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 124505249  2178 QRSGLNQIPNRRFTLWWSPTINRANVYVGFQVQLDLTGIFMHGKIPTLKISLIQIFRAHLWQKIHESLVMDICQVFD 2254
Cdd:pfam10596   81 QRQGLSQIPNRRFTLWWSPTINRANVYVGFQVQLDLTGVFMHGKLPTLKISLIQIFRGHLWQKIHESVVMDLCQVLD 157
MPN_PRP8 cd08056
Mpr1p, Pad1p N-terminal (MPN) domains without isopeptidase activity found in splicing factor ...
2767-3118 1.39e-105

Mpr1p, Pad1p N-terminal (MPN) domains without isopeptidase activity found in splicing factor Prp8; Members of this family are found in pre-mRNA-processing factor 8 (Prp8) which is a critical splicing factor, interacting with several other spliceosomal proteins, snRNAs, and the pre-mRNA, thus organizing and stabilizing the spliceosome catalytic core. Prp8 is one of the largest and most highly conserved of nuclear proteins, occupying a central position in the catalytic core of the spliceosome. Its C-terminal domain exhibits a JAB1/MPN-like core similar to deubiquitinating enzymes, but does not show catalytic isopeptidase activity, possibly because the putative isopeptidase center is covered by insertions and terminal appendices that are grafted onto this core, thus impairing the metal binding site. It is proposed that this domain is a protein interaction domain instead of a Zn(2+)-dependent metalloenzyme as proposed for some MPN proteins. The DEAD-box protein Brr2 and the GTPase Snu114 bind to the Prp8 C-terminus, a region where mutations in human Prp8 (hPrp8) cause a severe form of the genetic disorder retinitis pigmentosa, RP13, which leads to progressive photoreceptor degeneration in the retina and eventual blindness. At the N-terminus of Prp8, there are several domains, including a highly variable nuclear localization signal (NLS) motif rich in prolines, a conserved RNA recognition motif (RRM), and U5 and U6 snRNA binding sites.


Pssm-ID: 163687 [Multi-domain]  Cd Length: 252  Bit Score: 338.45  E-value: 1.39e-105
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2767 FTTKTDWKIRYLANNSLLFRTKNIYVnnnnmsnmsnintisasaSSHNILNkngtnsdnqnshyhtsinsiNDYTYVIAK 2846
Cdd:cd08056     1 FSSKTDWRVRAIAATNLHLRTKNIYV------------------SSDDIKE--------------------TGYTYILPK 42
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2847 NLLEKFICISDLKIQVGGFLFGSSPEDNSYVKEIKCILIPPQIGNYQSVTLSSYMPSSKYLQNLELLGWIHTQttncSNT 2926
Cdd:cd08056    43 NLLKKFISISDLRTQIAGYLYGKSPPDNPQVKEIRCIVLVPQLGTHQTVTLPQQLPQHEYLEDLEPLGWIHTQ----PNE 118
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 2927 NNHLTAYDMVAHFNFLQEckrqmskgkkvadashndddvddyddddynnneddynnnneddninnnseggtkrdetYKMW 3006
Cdd:cd08056   119 LPQLSPQDVTTHAKILAD----------------------------------------------------------NPSW 140
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249 3007 DKNKTIILTCSFTPGSCTINAYKLTSDGYSFAKSKKNSSDLYVFPNVNNLYEPVQILLSNVFVGYFLIPDDHIWNYNLMG 3086
Cdd:cd08056   141 DGEKTVILTCSFTPGSCSLTAYKLTPEGYEWGKQNKDLGNNTPKGYSPSFYEKVQLLLSDRFLGFFLVPEDGVWNYNFMG 220
                         330       340       350
                  ....*....|....*....|....*....|..
gi 124505249 3087 IKFNNNQKYAPHLDIPQPFYADIHRPNHFLQF 3118
Cdd:cd08056   221 AKHSPNMKYDLKLDIPKEFYHELHRPTHFLQF 252
U5_2-snRNA_bdg pfam10597
U5-snRNA binding site 2 of PrP8; The essential spliceosomal protein Prp8 interacts with U5 and ...
1865-1997 1.10e-77

U5-snRNA binding site 2 of PrP8; The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor.


Pssm-ID: 402297  Cd Length: 134  Bit Score: 253.46  E-value: 1.10e-77
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  1865 KEGTWKLQNEMTKEITAEAYLKVSDNSMKRFENRVRQILMSSGSTTFTKIANKWNTTLIGLMTYFREAVLDTEELLDLLV 1944
Cdd:pfam10597    1 NEGVWDLINESTKERTAKAFLQVSEESINNFRNRIRQILMSSGSTTFTKIANKWNTALIALFTYFREAIVSTEELLDILV 80
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|...
gi 124505249  1945 KCENKIQTRIKIGLNSKMPSRFPPVVFYTPKELGGLGMLSMGHILIPESDLRY 1997
Cdd:pfam10597   81 KCETRVQNRVKLGLNSKMPSRFPPAVFYTPKELGGLGMLSAGHVLIPASDLRW 133
PROCT pfam08084
PROCT (NUC072) domain; The PROCT domain is the C-terminal domain in pre-mRNA splicing factors ...
3012-3122 4.40e-42

PROCT (NUC072) domain; The PROCT domain is the C-terminal domain in pre-mRNA splicing factors of PRO8 family.


Pssm-ID: 462362  Cd Length: 111  Bit Score: 150.77  E-value: 4.40e-42
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  3012 IILTCSFTPGSCTINAYKLTSDGYSFAKSKKNSSDLYVFPNVNNLYEPVQILLSNVFVGYFLIPDDHIWNYNLMGIKFNN 3091
Cdd:pfam08084    1 ITLTVSFTPGSVSLSAYTLTPEGYEWGRQNKDLISDNPQGFSPSFSEKVQLLLSDRILGFFLVPEDGVWNYSFMGASFNP 80
                           90       100       110
                   ....*....|....*....|....*....|.
gi 124505249  3092 NQKYAPHLDIPQPFYADIHRPNHFLQFSLLD 3122
Cdd:pfam08084   81 NMKYDLKLDIPLEFYDELHRPTHFLNFNELE 111
RRM_4 pfam10598
RNA recognition motif of the spliceosomal PrP8; The large RNA-protein complex of the ...
1533-1624 3.57e-33

RNA recognition motif of the spliceosomal PrP8; The large RNA-protein complex of the spliceosome catalyzes pre-mRNA splicing. One of the most conserved core proteins is PrP8, which occupies a central position in the catalytic core of the spliceosome, has been implicated in several crucial molecular rearrangements that occur there, and has recently come under the spotlight for its role in the inherited human disease, Retinitis Pigmentosa. The RNA-recognition motif of PrP8 is highly conserved and provides a possible RNA binding centre for the 5-prime SS, BP, or 3-prime SS of pre-mRNA, which are known to contact Prp8. The most conserved regions of an RRM are defined as the RNP1 and RNP2 sequences. Recognition of RNA targets can also be modulated by a number of other factors, most notably the two loops beta1-alpha1, beta2-beta3 and the amino acid residues C-terminal to the RNP2 domain.


Pssm-ID: 463163  Cd Length: 92  Bit Score: 124.66  E-value: 3.57e-33
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  1533 EKIDLTLLNRLLRLIVDHNIADYITAKNNTNITFKDMNHINSFGIIRGLQFSSFVFQYYTIIIDLLILGLTRAYDIAGPY 1612
Cdd:pfam10598    1 EKIDLTLLNRLLRLIMDHNLADYITAKNNVVLTFKDMSHVNSYGLIRGLQFSSFVYQYYGLVLDLLILGLQRASEIAGPP 80
                           90
                   ....*....|..
gi 124505249  1613 NDVNQFLTFQNV 1624
Cdd:pfam10598   81 QMPNEFLQFKDI 92
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
114-216 4.53e-08

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 55.18  E-value: 4.53e-08
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249    114 IPYNNMNafPPNMPKLPTN-MPFLPPNMPILPPHLQhmpnvlphlqnMPnVPPHLASFPNMINLPNLP-PHMHNLPPNMH 191
Cdd:smart00818   40 IPVSQQH--PPTHTLQPHHhIPVLPAQQPVVPQQPL-----------MP-VPGQHSMTPTQHHQPNLPqPAQQPFQPQPL 105
                            90       100
                    ....*....|....*....|....*....
gi 124505249    192 SLPPHMHNLPP----NMHSLPPNMNYIPP 216
Cdd:smart00818  106 QPPQPQQPMQPqppvHPIPPLPPQPPLPP 134
JAB_MPN smart00232
JAB/MPN domain; Domain in Jun kinase activation domain binding protein and proteasomal ...
2840-2947 4.29e-07

JAB/MPN domain; Domain in Jun kinase activation domain binding protein and proteasomal subunits. Domain at Mpr1p and Pad1p N-termini. Domain of unknown function.


Pssm-ID: 214573  Cd Length: 135  Bit Score: 51.61  E-value: 4.29e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249   2840 YTYVIAKNLLEKFIciSDLKIQVGGFLFGSSPEDNSYVKEIKCILIPPQIG-NYQSVTLSSYMPSSKYLQ---NLELLGW 2915
Cdd:smart00232    4 VHPLVPLNILKHAI--RDGPEEVCGVLLGKSNKDRPEVKEVFAVPNEPQDDsVQEYDEDYSHLMDEELKKvnkDLEIVGW 81
                            90       100       110
                    ....*....|....*....|....*....|..
gi 124505249   2916 IHTQTtncsNTNNHLTAYDMVAHFNFLQECKR 2947
Cdd:smart00232   82 YHSHP----DESPFPSEVDVATHESYQAPWPI 109
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
100-229 5.51e-06

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 51.35  E-value: 5.51e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249   100 PQNVPNGFINNIGNIPYNN----MNAFPPNMPKLPTNMPFLPPNMPILPPHLQHMPNVLPHlQNMPNVPPH-LASFPNMI 174
Cdd:pfam15279  160 PPGSPPMSMTPRGLLGKPQqhppPSPLPAFMEPSSMPPPFLRPPPSIPQPNSPLSNPMLPG-IGPPPKPPRnLGPPSNPM 238
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 124505249   175 NLPNLPPHMHNLPPNMHSLPPHMHNLPPNMHSlPPNMNYIPPGinnympNMMNMP 229
Cdd:pfam15279  239 HRPPFSPHHPPPPPTPPGPPPGLPPPPPRGFT-PPFGPPFPPV------NMMPNP 286
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
96-210 2.71e-05

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 49.04  E-value: 2.71e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249    96 INNIPQNVPNGFINNIGNIPYNNMNAFPPNMP-KLPTNMP---FLPPNMPI-LPPHLQHMPNVLPHLqnMPNVPPHLASF 170
Cdd:pfam15279  188 AFMEPSSMPPPFLRPPPSIPQPNSPLSNPMLPgIGPPPKPprnLGPPSNPMhRPPFSPHHPPPPPTP--PGPPPGLPPPP 265
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|..
gi 124505249   171 PNMINLPNLPPhmhnLPP-NMHSLPPHM-HNLPPNMHSLPPN 210
Cdd:pfam15279  266 PRGFTPPFGPP----FPPvNMMPNPPEMnFGLPSLAPLVPPV 303
MPN cd07767
Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) ...
2855-2919 2.95e-04

Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found at the N-termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid that is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion.
Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function.


Pssm-ID: 163686 [Multi-domain]  Cd Length: 116  Bit Score: 42.88  E-value: 2.95e-04
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 124505249 2855 ISDLKIQVGGFLFGSSPEDnsyVKEIKCILIPPQIGN--YQSVTLSSYMPSSKYLQNLELLGWIHTQ 2919
Cdd:cd07767     9 KSINGKEVIGLLYGSKTKK---VLDVDEVIAVPFDEGdkDDNVWFLMYLDFKKLNAGLRIVGWYHTH 72
JAB pfam01398
JAB1/Mov34/MPN/PAD-1 ubiquitin protease; Members of this family are found in proteasome ...
2840-2940 1.59e-03

JAB1/Mov34/MPN/PAD-1 ubiquitin protease; Members of this family are found in proteasome regulatory subunits, eukaryotic initiation factor 3 (eIF3) subunits and regulators of transcription factors. This family is also known as the MPN domain and PAD-1-like domain, JABP1 domain or JAMM domain. These are metalloenzymes that function as the ubiquitin isopeptidase/deubiquitinase in the ubiquitin-based signalling and protein turnover pathways in eukaryotes. Versions of the domain in prokaryotic cognates of the ubiquitin-modification pathway are shown to have a similar role, and the archaeal protein from Haloferax volcanii is found to cleave ubiquitin-like small archaeal modifier proteins (SAMP1/2) from protein conjugates.


Pssm-ID: 396120  Cd Length: 117  Bit Score: 40.79  E-value: 1.59e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 124505249  2840 YTYVIAKNLLEKFICISD----LKIQVGGFLFGSSPEDnsYVKEIKCILIPPQIGNYQSVTLSS--YMPSSKYL------ 2907
Cdd:pfam01398    4 RTVIIHPLVLLKILDHANrggkIGEEVMGVLLGKLEGD--GTIEITNSFALPQEETEDDVNAVAldQEYMENMHemlkkv 81
                           90       100       110
                   ....*....|....*....|....*....|....
gi 124505249  2908 -QNLELLGWIHTQTTNCSNTNNHLTAYDMVAHFN 2940
Cdd:pfam01398   82 nRKEEVVGWYHTHPGLCWLSSVDVHTHALYQRMI 115
KLF1_2_4_N-like cd22056
N-terminal domain of Kruppel-like factors with similarity to the N-terminal domains of ...
146-217 1.80e-03

N-terminal domain of Kruppel-like factors with similarity to the N-terminal domains of Kruppel-like factor (KLF)1, KLF2, and KLF4; Kruppel/Krueppel-like transcription factors (KLFs) belong to a family of proteins called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domains of an unknown subfamily of KLFs, predominantly found in fish, related to the N-terminal domains of KLF1, KLF2, and KLF4.


Pssm-ID: 409231 [Multi-domain]  Cd Length: 339  Bit Score: 43.49  E-value: 1.80e-03
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 124505249  146 HLQHMPNVLPHLQNMPNVPPHLASFPNminlPNLPPHMHNLPPNMHSLPPHMHNLPPNMHSlPPNMNYIPPG 217
Cdd:cd22056   209 PKHQMHSVHPQAFTHHQAAGPGALQGR----GGRGGPDCHLLHSSHHHHHHHHLQYQYMNA-PYPPHYAHQG 275
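The paired rows in each alignment above can be compared column by column to estimate how closely the query follows a domain model. The following is a minimal Python sketch, not part of the CD-Search output; the function name `percent_identity` is hypothetical, and it assumes that `-` marks a gap and that lowercase letters mark residues not aligned to the model, so both are skipped.

```python
def percent_identity(query_row: str, subject_row: str) -> float:
    """Percent identity over the aligned columns of two equal-length alignment rows."""
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    matches = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        # Assumption: '-' is a gap and lowercase letters are residues that are
        # not aligned to the domain model, so such columns are not counted.
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


# Final block of the cd13838 alignment (query residues 2703-2714).
query = "NEIRDILLGMEI"
subject = "SEIRDIILGMEI"
print(round(percent_identity(query, subject), 1))  # 10 of 12 columns match -> 83.3
```

For a full hit, the sequence portions of all blocks would be concatenated before calling the function.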
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
Database: CDSEARCH/cdd
Low complexity filter: no
Composition Based Adjustment: yes
E-value threshold: 0.01
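As an illustration of how the E-value threshold relates to the hit list, the sketch below parses the per-hit summary lines (`Pssm-ID`, `Cd Length`, `Bit Score`, `E-value`) and keeps hits at or below the threshold. It is a hypothetical helper written for this text dump, not an NCBI tool; the regular expression and the inclusive comparison are assumptions.

```python
import re

# Matches summary lines such as:
#   Pssm-ID: 432353  Cd Length: 230  Bit Score: 468.77  E-value: 2.26e-151
# An optional tag like "[Multi-domain]" may follow the Pssm-ID.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+).*?"
    r"Cd Length:\s*(?P<cd_length>\d+)\s+"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s+"
    r"E-value:\s*(?P<evalue>\S+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "evalue": float(m.group("evalue")),
    }


def passes_threshold(hit, threshold=0.01):
    # Assumption: a hit is kept when its E-value does not exceed the threshold.
    return hit["evalue"] <= threshold


lines = [
    "Pssm-ID: 432353  Cd Length: 230  Bit Score: 468.77  E-value: 2.26e-151",
    "Pssm-ID: 409231 [Multi-domain]  Cd Length: 339  Bit Score: 43.49  E-value: 1.80e-03",
]
hits = [h for h in map(parse_summary, lines) if h is not None]
print([h["pssm_id"] for h in hits if passes_threshold(h)])  # [432353, 409231]
```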
