FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011 Please cite: W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448 Query: pF1KE2646, 2249 aa 1>>>pF1KE2646 2249 - 2249 aa - 2249 aa Library: human.CCDS.faa 18511270 residues in 32554 sequences Statistics: Expectation_n fit: rho(ln(x))= 15.2800+/-0.00131; mu= -17.2822+/- 0.079 mean_var=673.1075+/-138.138, 0's: 0 Z-trim(116.6): 128 B-trim: 0 in 0/53 Lambda= 0.049435 statistics sampled from 17060 (17184) to 17060 sequences Algorithm: FASTA (3.7 Nov 2010) [optimized] Parameters: BL50 matrix (15:-5), open/ext: -10/-2 ktup: 2, E-join: 1 (0.77), E-opt: 0.2 (0.528), width: 16 Scan time: 9.650 The best scores are: opt bits E(32554) CCDS55072.1 ARID1B gene_id:57492|Hs108|chr6 (2249) 15624 1131.1 0 CCDS5251.2 ARID1B gene_id:57492|Hs108|chr6 (2236) 12103 880.0 0 CCDS285.1 ARID1A gene_id:8289|Hs108|chr1 (2285) 3736 283.3 8.7e-75 CCDS44091.1 ARID1A gene_id:8289|Hs108|chr1 (2068) 3359 256.4 1e-66 >>CCDS55072.1 ARID1B gene_id:57492|Hs108|chr6 (2249 aa) initn: 15624 init1: 15624 opt: 15624 Z-score: 6038.3 bits: 1131.1 E(32554): 0 Smith-Waterman score: 15624; 100.0% identity (100.0% similar) in 2249 aa overlap (1-2249:1-2249) 10 20 30 40 50 60 pF1KE2 MAHNAGAAAAAGTHSAKSGGSEAALKEGGSAAALSSSSSSSAAAAAASSSSSSGPGSAME :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 MAHNAGAAAAAGTHSAKSGGSEAALKEGGSAAALSSSSSSSAAAAAASSSSSSGPGSAME 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE2 TGLLPNHKLKTVGEAPAAPPHQQHHHHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 TGLLPNHKLKTVGEAPAAPPHQQHHHHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQ 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE2 QQQQQQQQQQQHPISNNNSLGGAGGGAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 QQQQQQQQQQQHPISNNNSLGGAGGGAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLS 130 140 150 160 170 180 190 200 210 220 230 240 pF1KE2 KPGDEDDAPPKMGEPAGGRYEHPGLGALGTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 KPGDEDDAPPKMGEPAGGRYEHPGLGALGTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPA 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE2 SGGPGGRAGPCFDQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 SGGPGGRAGPCFDQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPG 250 260 270 280 290 300 310 320 330 340 350 360 pF1KE2 YSRPGAGGGGGGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 YSRPGAGGGGGGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGG 310 320 330 340 350 360 370 380 390 400 410 420 pF1KE2 SSAGYGVLSSPRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 SSAGYGVLSSPRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLN 370 380 390 400 410 420 430 440 450 460 470 480 pF1KE2 QLLTSPSPMMRSYGGSYPEYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 QLLTSPSPMMRSYGGSYPEYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMG 430 440 450 460 470 480 490 500 510 520 530 540 pF1KE2 AQYAAASPAWAAAQQRSHPAMSPGTPGPTMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 AQYAAASPAWAAAQQRSHPAMSPGTPGPTMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQ 490 500 510 520 530 540 550 560 570 580 590 600 pF1KE2 QSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQYPQQQDSGDATWKETFWLMPPQYGQQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 QSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQYPQQQDSGDATWKETFWLMPPQYGQQ 550 560 570 580 590 600 610 620 630 640 650 660 pF1KE2 GVSGYCQQGQQPYYSQQPQPPHLPPQAQYLPSQSQQRYQPQQDMSQEGYGTRSQPPLAPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 GVSGYCQQGQQPYYSQQPQPPHLPPQAQYLPSQSQQRYQPQQDMSQEGYGTRSQPPLAPG 610 620 630 640 650 660 670 680 690 700 710 720 pF1KE2 KPNHEDLNLIQQERPSSLPDLSGSIDDLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPF :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 KPNHEDLNLIQQERPSSLPDLSGSIDDLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPF 670 680 690 700 710 720 730 740 750 760 770 780 pF1KE2 SPHASPHLSSIPGGPSPSPVGSPVGSNQSRSGPISPASIPGSQMPPQPPGSQSESSSHPA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 SPHASPHLSSIPGGPSPSPVGSPVGSNQSRSGPISPASIPGSQMPPQPPGSQSESSSHPA 730 740 750 760 770 780 790 800 810 820 830 840 pF1KE2 LSQSPMPQERGFMAGTQRNPQMAQYGPQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 LSQSPMPQERGFMAGTQRNPQMAQYGPQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTY 790 800 810 820 830 840 850 860 870 880 890 900 pF1KE2 GPQMSQYGPQGNYSRPPAYSGVPSASYSGPGPGMGISANNQMHGQGPSQPCGAVPLGRMP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 GPQMSQYGPQGNYSRPPAYSGVPSASYSGPGPGMGISANNQMHGQGPSQPCGAVPLGRMP 850 860 870 880 890 900 910 920 930 940 950 960 pF1KE2 SAGMQNRPFPGNMSSMTPSSPGMSQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 SAGMQNRPFPGNMSSMTPSSPGMSQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSR 910 920 930 940 950 960 970 980 990 1000 1010 1020 pF1KE2 QGSFPGMNQSGLMASSSPYSQPMNNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 QGSFPGMNQSGLMASSSPYSQPMNNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPG 970 980 990 1000 1010 1020 1030 1040 1050 1060 1070 1080 pF1KE2 ESKLPLPLKADGKEEGTPQPESKSKKSSSSTTTGEKITKVYELGNEPERKLWVDRYLTFM :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 ESKLPLPLKADGKEEGTPQPESKSKKSSSSTTTGEKITKVYELGNEPERKLWVDRYLTFM 1030 1040 1050 1060 1070 1080 1090 1100 1110 1120 1130 1140 pF1KE2 EERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQVNKNKKWRELATNLNVGTSSSAASS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 EERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQVNKNKKWRELATNLNVGTSSSAASS 1090 1100 1110 1120 1130 1140 1150 1160 1170 1180 1190 1200 pF1KE2 LKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQPKLQPPSPANSGSLQGPQTPQSTGS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 LKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQPKLQPPSPANSGSLQGPQTPQSTGS 1150 1160 1170 1180 1190 1200 1210 1220 1230 1240 1250 1260 pF1KE2 NSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTISVHDPFSDVSDSSFPKRNSMTPNAPY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 NSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTISVHDPFSDVSDSSFPKRNSMTPNAPY 1210 1220 1230 1240 1250 1260 1270 1280 1290 1300 1310 1320 pF1KE2 QQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 QQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSN 1270 1280 1290 1300 1310 1320 1330 1340 1350 1360 1370 1380 pF1KE2 LGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQPPYGGHQPGLYPQQPNYKRHMDGMY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 LGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQPPYGGHQPGLYPQQPNYKRHMDGMY 1330 1340 1350 1360 1370 1380 1390 1400 1410 1420 1430 1440 pF1KE2 GPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPDRRPIQGQYPYPYSRERMQGPGQIQT :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 GPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPDRRPIQGQYPYPYSRERMQGPGQIQT 1390 1400 1410 1420 1430 1440 1450 1460 1470 1480 1490 1500 pF1KE2 HGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAPPYPGMNRTDDMM :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 HGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAPPYPGMNRTDDMM 1450 1460 1470 1480 1490 1500 1510 1520 1530 1540 1550 1560 pF1KE2 VPDQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPNHISRAPSPASFQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 VPDQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPNHISRAPSPASFQ 1510 1520 1530 1540 1550 1560 1570 1580 1590 1600 1610 1620 pF1KE2 RSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGPPPQPPPIRREITFPPGSVEASQPVL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 RSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGPPPQPPPIRREITFPPGSVEASQPVL 1570 1580 1590 1600 1610 1620 1630 1640 1650 1660 1670 1680 pF1KE2 KQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTVATFNLSQLSGFL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 KQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTVATFNLSQLSGFL 1630 1640 1650 1660 1670 1680 1690 1700 1710 1720 1730 1740 pF1KE2 ELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDSGKEEEDAECIDD :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 ELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDSGKEEEDAECIDD 1690 1700 1710 1720 1730 1740 1750 1760 1770 1780 1790 1800 pF1KE2 DEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLPIKIVKKNNLFVV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 DEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLPIKIVKKNNLFVV 1750 1760 1770 1780 1790 1800 1810 1820 1830 1840 1850 1860 pF1KE2 DRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPPPLSSAGRKKEQE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 DRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPPPLSSAGRKKEQE 1810 1820 1830 1840 1850 1860 1870 1880 1890 1900 1910 1920 pF1KE2 GKGDSEEQQEKSIIATIDDVLSARPGALPEDANPGPQTESSKFPFGIQQAKSHRNIKLLE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 GKGDSEEQQEKSIIATIDDVLSARPGALPEDANPGPQTESSKFPFGIQQAKSHRNIKLLE 1870 1880 1890 1900 1910 1920 1930 1940 1950 1960 1970 1980 pF1KE2 DEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKHPGLVLILGKLIL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 DEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKHPGLVLILGKLIL 1930 1940 1950 1960 1970 1980 1990 2000 2010 2020 2030 2040 pF1KE2 LHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTLANISGQLDLSAY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 LHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTLANISGQLDLSAY 1990 2000 2010 2020 2030 2040 2050 2060 2070 2080 2090 2100 pF1KE2 TESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCKLSIQDNNVDLIL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 TESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCKLSIQDNNVDLIL 2050 2060 2070 2080 2090 2100 2110 2120 2130 2140 2150 2160 pF1KE2 ATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARAIAVQKGSIGNLI :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 ATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARAIAVQKGSIGNLI 2110 2120 2130 2140 2150 2160 2170 2180 2190 2200 2210 2220 pF1KE2 SFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARVDENRSEFLLHEG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS55 SFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARVDENRSEFLLHEG 2170 2180 2190 2200 2210 2220 2230 2240 pF1KE2 RLLDISISAVLNSLVASVICDVLFQIGQL ::::::::::::::::::::::::::::: CCDS55 RLLDISISAVLNSLVASVICDVLFQIGQL 2230 2240 >>CCDS5251.2 ARID1B gene_id:57492|Hs108|chr6 (2236 aa) initn: 11473 init1: 11473 opt: 12103 Z-score: 4681.2 bits: 880.0 E(32554): 0 Smith-Waterman score: 15489; 99.4% identity (99.4% similar) in 2249 aa overlap (1-2249:1-2236) 10 20 30 40 50 60 pF1KE2 MAHNAGAAAAAGTHSAKSGGSEAALKEGGSAAALSSSSSSSAAAAAASSSSSSGPGSAME :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 MAHNAGAAAAAGTHSAKSGGSEAALKEGGSAAALSSSSSSSAAAAAASSSSSSGPGSAME 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE2 TGLLPNHKLKTVGEAPAAPPHQQHHHHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 TGLLPNHKLKTVGEAPAAPPHQQHHHHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQ 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE2 QQQQQQQQQQQHPISNNNSLGGAGGGAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 QQQQQQQQQQQHPISNNNSLGGAGGGAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLS 130 140 150 160 170 180 190 200 210 220 230 240 pF1KE2 KPGDEDDAPPKMGEPAGGRYEHPGLGALGTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 KPGDEDDAPPKMGEPAGGRYEHPGLGALGTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPA 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE2 SGGPGGRAGPCFDQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 SGGPGGRAGPCFDQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPG 250 260 270 280 290 300 310 320 330 340 350 360 pF1KE2 YSRPGAGGGGGGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 YSRPGAGGGGGGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGG 310 320 330 340 350 360 370 380 390 400 410 420 pF1KE2 SSAGYGVLSSPRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 SSAGYGVLSSPRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLN 370 380 390 400 410 420 430 440 450 460 470 480 pF1KE2 QLLTSPSPMMRSYGGSYPEYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 QLLTSPSPMMRSYGGSYPEYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMG 430 440 450 460 470 480 490 500 510 520 530 540 pF1KE2 AQYAAASPAWAAAQQRSHPAMSPGTPGPTMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 AQYAAASPAWAAAQQRSHPAMSPGTPGPTMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQ 490 500 510 520 530 540 550 560 570 580 590 600 pF1KE2 QSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQYPQQQDSGDATWKETFWLMPPQYGQQ ::::::::::::::::::::::::::::::::::::::: :::::::: CCDS52 QSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQYPQQQ-------------MPPQYGQQ 550 560 570 580 610 620 630 640 650 660 pF1KE2 GVSGYCQQGQQPYYSQQPQPPHLPPQAQYLPSQSQQRYQPQQDMSQEGYGTRSQPPLAPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 GVSGYCQQGQQPYYSQQPQPPHLPPQAQYLPSQSQQRYQPQQDMSQEGYGTRSQPPLAPG 590 600 610 620 630 640 670 680 690 700 710 720 pF1KE2 KPNHEDLNLIQQERPSSLPDLSGSIDDLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPF :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 KPNHEDLNLIQQERPSSLPDLSGSIDDLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPF 650 660 670 680 690 700 730 740 750 760 770 780 pF1KE2 SPHASPHLSSIPGGPSPSPVGSPVGSNQSRSGPISPASIPGSQMPPQPPGSQSESSSHPA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 SPHASPHLSSIPGGPSPSPVGSPVGSNQSRSGPISPASIPGSQMPPQPPGSQSESSSHPA 710 720 730 740 750 760 790 800 810 820 830 840 pF1KE2 LSQSPMPQERGFMAGTQRNPQMAQYGPQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 LSQSPMPQERGFMAGTQRNPQMAQYGPQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTY 770 780 790 800 810 820 850 860 870 880 890 900 pF1KE2 GPQMSQYGPQGNYSRPPAYSGVPSASYSGPGPGMGISANNQMHGQGPSQPCGAVPLGRMP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 GPQMSQYGPQGNYSRPPAYSGVPSASYSGPGPGMGISANNQMHGQGPSQPCGAVPLGRMP 830 840 850 860 870 880 910 920 930 940 950 960 pF1KE2 SAGMQNRPFPGNMSSMTPSSPGMSQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 SAGMQNRPFPGNMSSMTPSSPGMSQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSR 890 900 910 920 930 940 970 980 990 1000 1010 1020 pF1KE2 QGSFPGMNQSGLMASSSPYSQPMNNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 QGSFPGMNQSGLMASSSPYSQPMNNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPG 950 960 970 980 990 1000 1030 1040 1050 1060 1070 1080 pF1KE2 ESKLPLPLKADGKEEGTPQPESKSKKSSSSTTTGEKITKVYELGNEPERKLWVDRYLTFM :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 ESKLPLPLKADGKEEGTPQPESKSKKSSSSTTTGEKITKVYELGNEPERKLWVDRYLTFM 1010 1020 1030 1040 1050 1060 1090 1100 1110 1120 1130 1140 pF1KE2 EERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQVNKNKKWRELATNLNVGTSSSAASS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 EERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQVNKNKKWRELATNLNVGTSSSAASS 1070 1080 1090 1100 1110 1120 1150 1160 1170 1180 1190 1200 pF1KE2 LKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQPKLQPPSPANSGSLQGPQTPQSTGS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 LKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQPKLQPPSPANSGSLQGPQTPQSTGS 1130 1140 1150 1160 1170 1180 1210 1220 1230 1240 1250 1260 pF1KE2 NSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTISVHDPFSDVSDSSFPKRNSMTPNAPY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 NSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTISVHDPFSDVSDSSFPKRNSMTPNAPY 1190 1200 1210 1220 1230 1240 1270 1280 1290 1300 1310 1320 pF1KE2 QQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 QQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSN 1250 1260 1270 1280 1290 1300 1330 1340 1350 1360 1370 1380 pF1KE2 LGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQPPYGGHQPGLYPQQPNYKRHMDGMY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 LGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQPPYGGHQPGLYPQQPNYKRHMDGMY 1310 1320 1330 1340 1350 1360 1390 1400 1410 1420 1430 1440 pF1KE2 GPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPDRRPIQGQYPYPYSRERMQGPGQIQT :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 GPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPDRRPIQGQYPYPYSRERMQGPGQIQT 1370 1380 1390 1400 1410 1420 1450 1460 1470 1480 1490 1500 pF1KE2 HGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAPPYPGMNRTDDMM :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 HGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAPPYPGMNRTDDMM 1430 1440 1450 1460 1470 1480 1510 1520 1530 1540 1550 1560 pF1KE2 VPDQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPNHISRAPSPASFQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 VPDQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPNHISRAPSPASFQ 1490 1500 1510 1520 1530 1540 1570 1580 1590 1600 1610 1620 pF1KE2 RSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGPPPQPPPIRREITFPPGSVEASQPVL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 RSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGPPPQPPPIRREITFPPGSVEASQPVL 1550 1560 1570 1580 1590 1600 1630 1640 1650 1660 1670 1680 pF1KE2 KQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTVATFNLSQLSGFL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 KQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTVATFNLSQLSGFL 1610 1620 1630 1640 1650 1660 1690 1700 1710 1720 1730 1740 pF1KE2 ELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDSGKEEEDAECIDD :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 ELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDSGKEEEDAECIDD 1670 1680 1690 1700 1710 1720 1750 1760 1770 1780 1790 1800 pF1KE2 DEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLPIKIVKKNNLFVV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 DEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLPIKIVKKNNLFVV 1730 1740 1750 1760 1770 1780 1810 1820 1830 1840 1850 1860 pF1KE2 DRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPPPLSSAGRKKEQE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 DRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPPPLSSAGRKKEQE 1790 1800 1810 1820 1830 1840 1870 1880 1890 1900 1910 1920 pF1KE2 GKGDSEEQQEKSIIATIDDVLSARPGALPEDANPGPQTESSKFPFGIQQAKSHRNIKLLE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 GKGDSEEQQEKSIIATIDDVLSARPGALPEDANPGPQTESSKFPFGIQQAKSHRNIKLLE 1850 1860 1870 1880 1890 1900 1930 1940 1950 1960 1970 1980 pF1KE2 DEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKHPGLVLILGKLIL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 DEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKHPGLVLILGKLIL 1910 1920 1930 1940 1950 1960 1990 2000 2010 2020 2030 2040 pF1KE2 LHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTLANISGQLDLSAY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 LHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTLANISGQLDLSAY 1970 1980 1990 2000 2010 2020 2050 2060 2070 2080 2090 2100 pF1KE2 TESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCKLSIQDNNVDLIL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 TESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCKLSIQDNNVDLIL 2030 2040 2050 2060 2070 2080 2110 2120 2130 2140 2150 2160 pF1KE2 ATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARAIAVQKGSIGNLI :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 ATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARAIAVQKGSIGNLI 2090 2100 2110 2120 2130 2140 2170 2180 2190 2200 2210 2220 pF1KE2 SFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARVDENRSEFLLHEG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 SFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARVDENRSEFLLHEG 2150 2160 2170 2180 2190 2200 2230 2240 pF1KE2 RLLDISISAVLNSLVASVICDVLFQIGQL ::::::::::::::::::::::::::::: CCDS52 RLLDISISAVLNSLVASVICDVLFQIGQL 2210 2220 2230 >>CCDS285.1 ARID1A gene_id:8289|Hs108|chr1 (2285 aa) initn: 5031 init1: 1919 opt: 3736 Z-score: 1456.1 bits: 283.3 E(32554): 8.7e-75 Smith-Waterman score: 7221; 52.3% identity (70.3% similar) in 2354 aa overlap (116-2248:25-2284) 90 100 110 120 130 140 pF1KE2 HHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGG .. .:::... . . . . :.. CCDS28 MAAQVAPAAASSLGNPPPPPPSELKKAEQQQREEAGGEAAAAAAAERGEMKAAA 10 20 30 40 50 150 160 170 180 190 200 pF1KE2 GAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGL : . :: . :: : :. :.... : . : : : . . :. .:.: CCDS28 GQESEGPAVGPPQPLG-KELQDGAESNGGGGGGGAGSGGGPGAEPDLKNSNGNAGPRPAL 60 70 80 90 100 110 210 220 230 240 250 pF1KE2 GALGTQQPPVAVPGGGGGPA---AVPEFNNYYGSAAPASG-G-PGGR--------AGPCF . . .:: ::::: . ..: . . :: : : : :: :. : CCDS28 NN-NLTEPP---GGGGGGSSDGVGAPPHSAAAALPPPAYGFGQPYGRSPSAVAAAAAAVF 120 130 140 150 160 260 270 280 290 300 310 pF1KE2 -DQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHE-GYPNSQCNHYPGYSRPGAGGGG .:::::::::.. ..:... .: : ::::. :.:: : : : : .: CCDS28 HQQHGGQQSPGLAALQSGGG--GGLEPYAGPQQNSHDHGFPNHQYNSY--YPNRSAYPPP 170 180 190 200 210 220 320 330 340 350 360 370 pF1KE2 GGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSS . . . .. :: :.:.:.:.:. ..:.:.... :: CCDS28 APAYALSSPRGGTPGSGAAAAAGSKPPPSSSASASSSS--------------------SS 230 240 250 260 380 390 400 410 420 430 pF1KE2 PRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMM :: : : :::: ::::: : : :. ::::::::::::: CCDS28 FAQQRFGAM---GGGGP---------SAAGG-----GTPQ-PT-ATPTLNQLLTSPSSA- 270 280 290 300 440 450 460 470 480 pF1KE2 RSYGGSYP--EYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASP :.: : :: .::. ::. .:. : : ..: .. .: :::. CCDS28 RGYQG-YPGGDYSGG---------PQDGGAGKGPADMASQCWGA-------AAAAAAAAA 310 320 330 340 490 500 510 520 530 pF1KE2 AWAAAQQRSHPA-MSPGTPG----PTMGRSQ-GSPMDPMVMKRPQLYGMGSNPHSQ---- : ..:::::: : ::::. : : : .:::: : ::: :: :.::.:: CCDS28 ASGGAQQRSHHAPMSPGSSGGGGQPLARTPQPSSPMDQMGKMRPQPYG-GTNPYSQQQGP 350 360 370 380 390 400 540 550 560 570 580 pF1KE2 ---PQQSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQY----P---QQQDSGDATWKE :::. ::: :: :::::. .:::. .::.:..: : :: :: . . CCDS28 PSGPQQGHGYPGQPYGSQTPQRYPMTMQGRAQSAMGGLSYTQQIPPYGQQGPSGYGQQGQ 410 420 430 440 450 460 590 600 610 620 pF1KE2 TFWL-----MP----PQYGQQGVS-------GYCQQGQ---------QPYYSQQP-QPPH : . : : :.:: : .: :: : :: ::::: :::: CCDS28 TPYYNQQSPHPQQQQPPYSQQPPSQTPHAQPSYQQQPQSQPPQLQSSQPPYSQQPSQPPH 470 480 490 500 510 520 630 pF1KE2 ------LP------------------PQAQ-----YLPSQ-------------------- : :::: :.: CCDS28 QQSPAPYPSQQSTTQQHPQSQPPYSQPQAQSPYQQQQPQQPAPSTLSQQAAYPQPQSQQS 530 540 550 560 570 580 640 650 660 670 680 pF1KE2 -----SQQRYQPQQDMSQEGYGTR--SQPPLAPGKPNHEDLNLIQQERPSSLPDLSGSID ::::. : :..::...:.. : : .. .: ..::.:: : ::::::::::::: CCDS28 QQTAYSQQRFPPPQELSQDSFGSQASSAPSMTSSKGGQEDMNLSLQSRPSSLPDLSGSID 590 600 610 620 630 640 690 700 710 720 730 740 pF1KE2 DLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPVGS ::: :::..:: .::.:: .::::.::::::::::::.:::: .: : :::::::::.. CCDS28 DLPMGTEGALSPGVSTSGISSSQGEQSNPAQSPFSPHTSPHLPGIRG-PSPSPVGSPASV 650 660 670 680 690 700 750 760 770 780 790 800 pF1KE2 NQSRSGPISPASIPGSQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQYG ::::::.:::..::.::::.::..::.: ::...:: . :.::.: :::::: ::. CCDS28 AQSRSGPLSPAAVPGNQMPPRPPSGQSDSIMHPSMNQSSIAQDRGYM---QRNPQMPQYS 710 720 730 740 750 760 810 820 830 840 850 860 pF1KE2 PQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPSAS : : ..::. :::.:.:..:.:: :: :.:::: .::::::.: : : :...:.:. CCDS28 SPQPGSALSPRQPSGGQIHTGMGSYQQ-NSMGSYGPQGGQYGPQGGYPRQPNYNALPNAN 770 780 790 800 810 820 870 880 890 900 910 920 pF1KE2 YSGPGPGMGIS---ANNQMHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSPGM : . : . ::. :..::::: : :..: ::: :.: :::. ::..: : CCDS28 YPSAGMAGGINPMGAGGQMHGQPGIPPYGTLPPGRMSHASMGNRPYGPNMANMPP----- 830 840 850 860 870 930 940 950 960 970 980 pF1KE2 SQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQPM : : :: :: .:::.::.:.: :..:::: :.: ..:.:::.:.:... ::.: . CCDS28 --QVGSGMCPPPGGMNRKTQETAVA-MHVAANSIQNRPPGYPNMNQGGMMGTGPPYGQGI 880 890 900 910 920 930 990 1000 1010 1020 1030 1040 pF1KE2 NNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPESK :. ....: :.:::::. .:.:.::. .. .::. :. :: : ..: .:::. ::: CCDS28 NSMAGMINPQGPPYSMGGTMANNSAGMAASPEMMGLGDVKLTPATKMNNKADGTPKTESK 940 950 960 970 980 990 1050 1060 1070 1080 1090 1100 pF1KE2 SKKSSSSTTTGEKITKVYELGNEPERKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRL ::::::::::.:::::.::::.:::::.::::::.: ::.. ...:::::.:::::.:: CCDS28 SKKSSSSTTTNEKITKLYELGGEPERKMWVDRYLAFTEEKAMGMTNLPAVGRKPLDLYRL 1000 1010 1020 1030 1040 1050 1110 1120 1130 1140 1150 1160 pF1KE2 YVCVKEIGGLAQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPP :: :::::::.::::::::::::::::::::::::::::::::: :.::::::::::.:: CCDS28 YVSVKEIGGLTQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQCLYAFECKIERGEDPP 1060 1070 1080 1090 1100 1110 1170 1180 1190 1200 1210 1220 pF1KE2 PEVFSTGDTKK-QPKLQPPSPANSGSLQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQ :..:...:.:: :::.::::::.:::.::::::::: :.:::: :::::::::::::.: CCDS28 PDIFAAADSKKSQPKIQPPSPAGSGSMQGPQTPQST-SSSMAE-GGDLKPPTPASTPHSQ 1120 1130 1140 1150 1160 1170 1230 1240 1250 1260 1270 1280 pF1KE2 MTPMQG-GRSSTISVHDPFSDVSDSSFPKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDP . :. : .::......: :.: :::.: :::::::: :: .:. :.:::: ::::::: CCDS28 IPPLPGMSRSNSVGIQDAFNDGSDSTFQKRNSMTPNPGYQPSMNTSDMMGRMSYEPNKDP 1180 1190 1200 1210 1220 1230 1290 1300 1310 1320 1330 pF1KE2 FGGMRKVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSNLGMGQRQQFPYGASYDR---- .:.:::.::: .:::..:: ::..: : :... . ...:..:: ::..:::. ::: CCDS28 YGSMRKAPGS-DPFMSSGQGPNGGMGDPYSRAAGPGLGNVAMGPRQHYPYGGPYDRVRTE 1240 1250 1260 1270 1280 1290 1340 1350 pF1KE2 -------------------------------------------RHEPYGQQYPGQGPPSG ::. ::.:. :: ::: CCDS28 PGIGPEGNMSTGAPQPNLMPSNPDSGMYSPSRYPPQQQQQQQQRHDSYGNQFSTQGTPSG 1300 1310 1320 1330 1340 1350 1360 1370 1380 1390 pF1KE2 QPPYGGHQPGLYPQQP-NYKRHMDGMYGPPAKRHEGDMYNMQYS---------------- .: . ..: .: :: :::: ::: ::::::::::.::.. :: CCDS28 SP-FPSQQTTMYQQQQQNYKRPMDGTYGPPAKRHEGEMYSVPYSTGQGQPQQQQLPPAQP 1360 1370 1380 1390 1400 1410 1400 1410 1420 1430 pF1KE2 -----------SQQQEMYNQYGGSY-----SGPDRRPI---QGQYPYPYSRERMQGP-GQ : ::..:::::..: .. .::: :.:.:. ..:.:...: : CCDS28 QPASQQQAAQPSPQQDVYNQYGNAYPATATAATERRPAGGPQNQFPFQFGRDRVSAPPGT 1420 1430 1440 1450 1460 1470 1440 1450 1460 1470 1480 1490 pF1KE2 IQTHGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAPPYPGMNRTD ...::::::::.:.:. . : .:: .:::: : : :::. :. :.: : :.:::: CCDS28 NAQQNMPPQMMGGPIQASAEVAQQGTMWQGRNDMTYNYANRQSTGSAPQGPAYHGVNRTD 1480 1490 1500 1510 1520 1530 1500 1510 1520 1530 1540 1550 pF1KE2 DMMVPDQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPNHISRAPSPA .:. ::: :::..:::: . ::: .. :: . :.:::: .:: :::. ::: .. ::: CCDS28 EMLHTDQRANHEGSWPSH-GTRQPPYGPSAPVPPMTRPPPSNYQPPPSMQNHIPQVSSPA 1540 1550 1560 1570 1580 1560 1570 1580 1590 1600 1610 pF1KE2 SFQRSLENRMSPSKSPFLPS-MKMQKVMPTVPTSQVTGPPPQPPPIRREITFPPGSVEAS . : .::: :::::::: : :::::. : ::.:... : ::: :::.::::::::::. CCDS28 PLPRPMENRTSPSKSPFLHSGMKMQKAGPPVPASHIAPAPVQPPMIRRDITFPPGSVEAT 1590 1600 1610 1620 1630 1640 1620 1630 1640 1650 1660 1670 pF1KE2 QPVLKQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTVATFNLSQL ::::::::..: ::: :::::::::::::::::::::::::::::::::... ::::::: CCDS28 QPVLKQRRRLTMKDIGTPEAWRVMMSLKSGLLAESTWALDTINILLYDDNSIMTFNLSQL 1650 1660 1670 1680 1690 1700 1680 1690 1700 1710 1720 1730 pF1KE2 SGFLELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDSGKEEEDAE :.:::::::::.:::.::::: :::::::.:..: . .: . .: : : :::. : CCDS28 PGLLELLVEYFRRCLIEIFGILKEYEVGDPGQRTL-LDPGRFSKVSSPAPMEGGEEEE-E 1710 1720 1730 1740 1750 1760 1740 1750 1760 1770 1780 1790 pF1KE2 CIDDDEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLPIKIVKKNN . :.::.:: .:.::. ::... : :. . . : :::::::.:::.::. CCDS28 LLGPKLEEEEEEEV----VENDEE--IAFSGKDKPASENSEEKLISKFDKLPVKIVQKND 1770 1780 1790 1800 1810 1820 1800 1810 1820 1830 1840 1850 pF1KE2 LFVVDRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPPPLSSAGRK :::: ::::::::::.::::::..:::::::::::::::: :. : : : : : :: CCDS28 PFVVDCSDKLGRVQEFDSGLLHWRIGGGDTTEHIQTHFESKTELLPSR-PHAPCPPAPRK 1830 1840 1850 1860 1870 1880 1860 1870 1880 1890 1900 pF1KE2 K--EQEGKGDSEEQQ--------EKSIIATIDDVLSARPGALPEDANPGPQT--ESSKFP . :: . .:. :: : ::.::.::.: ..: ::. . .. :::::: CCDS28 HVTTAEGTPGTTDQEGPPPDGPPEKRITATMDDMLSTRSSTLTEDGAKSSEAIKESSKFP 1890 1900 1910 1920 1930 1940 1910 1920 1930 1940 1950 1960 pF1KE2 FGIQQAKSHRNIKLLEDEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAE :::. :.::::::.:::::.:.:::::::. :::::::::.:::: .:::::::::: : CCDS28 FGISPAQSHRNIKILEDEPHSKDETPLCTLLDWQDSLAKRCVCVSNTIRSLSFVPGNDFE 1950 1960 1970 1980 1990 2000 1970 1980 1990 2000 2010 2020 pF1KE2 MSKHPGLVLILGKLILLHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNT :::::::.:::::::::::.:::::.:: ::::::..:.::.:.: ::::::::.::.:: CCDS28 MSKHPGLLLILGKLILLHHKHPERKQAPLTYEKEEEQDQGVSCNKVEWWWDCLEMLRENT 2010 2020 2030 2040 2050 2060 2030 2040 2050 2060 2070 2080 pF1KE2 LVTLANISGQLDLSAYTESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLE :::::::::::::: : ::::::.::::::: ::::::::::: :.:::.:::::::::: CCDS28 LVTLANISGQLDLSPYPESICLPVLDGLLHWAVCPSAEAQDPFSTLGPNAVLSPQRLVLE 2070 2080 2090 2100 2110 2120 2090 2100 2110 2120 2130 2140 pF1KE2 TLCKLSIQDNNVDLILATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDAL :: :::::::::::::::::::: ::.:.:.::...::::::::::...::.::::::.: CCDS28 TLSKLSIQDNNVDLILATPPFSRLEKLYSTMVRFLSDRKNPVCREMAVVLLANLAQGDSL 2130 2140 2150 2160 2170 2180 2150 2160 2170 2180 2190 2200 pF1KE2 AARAIAVQKGSIGNLISFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLA :::::::::::::::..::::... .:.:::: .:.::: ::.:: ::::: :::.:::: CCDS28 AARAIAVQKGSIGNLLGFLEDSLAATQFQQSQASLLHMQNPPFEPTSVDMMRRAARALLA 2190 2200 2210 2220 2230 2240 2210 2220 2230 2240 pF1KE2 MARVDENRSEFLLHEGRLLDISISAVLNSLVASVICDVLFQIGQL .:.::::.::: :.:.::::::.: ..::::..::::::: ::: CCDS28 LAKVDENHSEFTLYESRLLDISVSPLMNSLVSQVICDVLFLIGQS 2250 2260 2270 2280 >>CCDS44091.1 ARID1A gene_id:8289|Hs108|chr1 (2068 aa) initn: 4258 init1: 1919 opt: 3359 Z-score: 1311.4 bits: 256.4 E(32554): 1e-66 Smith-Waterman score: 6479; 50.9% identity (68.3% similar) in 2271 aa overlap (116-2248:25-2067) 90 100 110 120 130 140 pF1KE2 HHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGG .. .:::... . . . . :.. CCDS44 MAAQVAPAAASSLGNPPPPPPSELKKAEQQQREEAGGEAAAAAAAERGEMKAAA 10 20 30 40 50 150 160 170 180 190 200 pF1KE2 GAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGL : . :: . :: : :. :.... : . : : : . . :. .:.: CCDS44 GQESEGPAVGPPQPLG-KELQDGAESNGGGGGGGAGSGGGPGAEPDLKNSNGNAGPRPAL 60 70 80 90 100 110 210 220 230 240 250 pF1KE2 GALGTQQPPVAVPGGGGGPA---AVPEFNNYYGSAAPASG-G-PGGR--------AGPCF . . .:: ::::: . ..: . . :: : : : :: :. : CCDS44 NN-NLTEPP---GGGGGGSSDGVGAPPHSAAAALPPPAYGFGQPYGRSPSAVAAAAAAVF 120 130 140 150 160 260 270 280 290 300 310 pF1KE2 -DQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHE-GYPNSQCNHYPGYSRPGAGGGG .:::::::::.. ..:... .: : ::::. :.:: : : : : .: CCDS44 HQQHGGQQSPGLAALQSGGG--GGLEPYAGPQQNSHDHGFPNHQYNSY--YPNRSAYPPP 170 180 190 200 210 220 320 330 340 350 360 370 pF1KE2 GGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSS . . . .. :: :.:.:.:.:. ..:.:.... :: CCDS44 APAYALSSPRGGTPGSGAAAAAGSKPPPSSSASASSSS--------------------SS 230 240 250 260 380 390 400 410 420 430 pF1KE2 PRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMM :: : : :::: ::::: : : :. ::::::::::::: CCDS44 FAQQRFGAM---GGGGP---------SAAGG-----GTPQ-PT-ATPTLNQLLTSPSSA- 270 280 290 300 440 450 460 470 480 pF1KE2 RSYGGSYP--EYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASP :.: : :: .::. ::. .:. : : ..: .. .: :::. CCDS44 RGYQG-YPGGDYSGG---------PQDGGAGKGPADMASQCWGA-------AAAAAAAAA 310 320 330 340 490 500 510 520 530 pF1KE2 AWAAAQQRSHPA-MSPGTPG----PTMGRSQ-GSPMDPMVMKRPQLYGMGSNPHSQ---- : ..:::::: : ::::. : : : .:::: : ::: :: :.::.:: CCDS44 ASGGAQQRSHHAPMSPGSSGGGGQPLARTPQPSSPMDQMGKMRPQPYG-GTNPYSQQQGP 350 360 370 380 390 400 540 550 560 570 580 pF1KE2 ---PQQSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQY----P---QQQDSGDATWKE :::. ::: :: :::::. .:::. .::.:..: : :: :: . . CCDS44 PSGPQQGHGYPGQPYGSQTPQRYPMTMQGRAQSAMGGLSYTQQIPPYGQQGPSGYGQQGQ 410 420 430 440 450 460 590 600 610 620 pF1KE2 TFWL-----MP----PQYGQQGVS-------GYCQQGQ---------QPYYSQQP-QPPH : . : : :.:: : .: :: : :: ::::: :::: CCDS44 TPYYNQQSPHPQQQQPPYSQQPPSQTPHAQPSYQQQPQSQPPQLQSSQPPYSQQPSQPPH 470 480 490 500 510 520 630 pF1KE2 ------LP------------------PQAQ-----YLPSQ-------------------- : :::: :.: CCDS44 QQSPAPYPSQQSTTQQHPQSQPPYSQPQAQSPYQQQQPQQPAPSTLSQQAAYPQPQSQQS 530 540 550 560 570 580 640 650 660 670 680 pF1KE2 -----SQQRYQPQQDMSQEGYGTR--SQPPLAPGKPNHEDLNLIQQERPSSLPDLSGSID ::::. : :..::...:.. : : .. .: ..::.:: : ::::::::::::: CCDS44 QQTAYSQQRFPPPQELSQDSFGSQASSAPSMTSSKGGQEDMNLSLQSRPSSLPDLSGSID 590 600 610 620 630 640 690 700 710 720 730 740 pF1KE2 DLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPVGS ::: :::..:: .::.:: .::::.::::::::::::.:::: .: : :::::::::.. CCDS44 DLPMGTEGALSPGVSTSGISSSQGEQSNPAQSPFSPHTSPHLPGIRG-PSPSPVGSPASV 650 660 670 680 690 700 750 760 770 780 790 800 pF1KE2 NQSRSGPISPASIPGSQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQYG ::::::.:::..::.::::.::..::.: ::...:: . :.::.: :::::: ::. CCDS44 AQSRSGPLSPAAVPGNQMPPRPPSGQSDSIMHPSMNQSSIAQDRGYM---QRNPQMPQYS 710 720 730 740 750 760 810 820 830 840 850 860 pF1KE2 PQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPSAS : : ..::. :::.:.:..:.:: :: :.:::: .::::::.: : : :...:.:. CCDS44 SPQPGSALSPRQPSGGQIHTGMGSYQQ-NSMGSYGPQGGQYGPQGGYPRQPNYNALPNAN 770 780 790 800 810 820 870 880 890 900 910 920 pF1KE2 YSGPGPGMGIS---ANNQMHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSPGM : . : . ::. :..::::: : :..: ::: :.: :::. ::..: : CCDS44 YPSAGMAGGINPMGAGGQMHGQPGIPPYGTLPPGRMSHASMGNRPYGPNMANMPP----- 830 840 850 860 870 930 940 950 960 970 980 pF1KE2 SQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQPM : : :: :: .:::.::.:.: :..:::: :.: ..:.:::.:.:... ::.: . CCDS44 --QVGSGMCPPPGGMNRKTQETAVA-MHVAANSIQNRPPGYPNMNQGGMMGTGPPYGQGI 880 890 900 910 920 930 990 1000 1010 1020 1030 1040 pF1KE2 NNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPESK :. ....: :.:::::. .:.:.::. .. .::. :. :: : ..: .:::. ::: CCDS44 NSMAGMINPQGPPYSMGGTMANNSAGMAASPEMMGLGDVKLTPATKMNNKADGTPKTESK 940 950 960 970 980 990 1050 1060 1070 1080 1090 1100 pF1KE2 SKKSSSSTTTGEKITKVYELGNEPERKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRL ::::::::::.:::::.::::.:::::.::::::.: ::.. ...:::::.:::::.:: CCDS44 SKKSSSSTTTNEKITKLYELGGEPERKMWVDRYLAFTEEKAMGMTNLPAVGRKPLDLYRL 1000 1010 1020 1030 1040 1050 1110 1120 1130 1140 1150 1160 pF1KE2 YVCVKEIGGLAQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPP :: :::::::.::::::::::::::::::::::::::::::::: :.::::::::::.:: CCDS44 YVSVKEIGGLTQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQCLYAFECKIERGEDPP 1060 1070 1080 1090 1100 1110 1170 1180 1190 1200 1210 1220 pF1KE2 PEVFSTGDTKK-QPKLQPPSPANSGSLQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQ :..:...:.:: :::.::::::.:::.::::::::: :.:::: :::::::::::::.: CCDS44 PDIFAAADSKKSQPKIQPPSPAGSGSMQGPQTPQST-SSSMAE-GGDLKPPTPASTPHSQ 1120 1130 1140 1150 1160 1170 1230 1240 1250 1260 1270 1280 pF1KE2 MTPMQG-GRSSTISVHDPFSDVSDSSFPKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDP . :. : .::......: :.: :::.: :::::::: :: .:. :.:::: ::::::: CCDS44 IPPLPGMSRSNSVGIQDAFNDGSDSTFQKRNSMTPNPGYQPSMNTSDMMGRMSYEPNKDP 1180 1190 1200 1210 1220 1230 1290 1300 1310 1320 1330 1340 pF1KE2 FGGMRKVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSNLGMGQRQQFPYGASYDR-RHE .:.:::.::: .:::..:: ::..: : :... . ...:..:: ::..:::. ::: : : CCDS44 YGSMRKAPGS-DPFMSSGQGPNGGMGDPYSRAAGPGLGNVAMGPRQHYPYGGPYDRVRTE 1240 1250 1260 1270 1280 1290 1350 1360 1370 1380 1390 1400 pF1KE2 PYGQQYPGQGPPSGQPPYGGHQPGLYPQQPNYKRHMDGMYGPPAKRHEGDMYNMQYSSQQ :: :: :. :. ::.:.:..:. .:::.: . : : ..:: CCDS44 ------PGIGPE-GNMSTGAPQPNLMPSNPD-----SGMYSP-------SRYPPQQQQQQ 1300 1310 1320 1330 1410 1420 1430 1440 1450 1460 pF1KE2 QEMYNQYGGSYSGPDRRPIQGQYPYPYSRERMQGPGQIQTHGIPPQMMGGPLQSSSSEGP :. ...::...: :.: : :.: CCDS44 QQRHDSYGNQFS---------------------------TQGTP----------SGS--- 1340 1350 1470 1480 1490 1500 1510 1520 pF1KE2 QQNMWAARNDMPYPYQNRQGPGGPTQAPPYPGMNRTDDMMVPDQRINHESQWPSHVSQRQ :.: : :. : :.: CCDS44 -----------PFPSQ---------QTTMY---------------------------QQQ 1360 1530 1540 1550 1560 1570 pF1KE2 PYMSSSASMQPITRPPQPSYQTPPSLPNHISRAPSPASFQRSLENRMSPSKSPFLPS-MK .:: : :. :: .::: :::::::: : :: CCDS44 QQVSSPA---PLPRP---------------------------MENRTSPSKSPFLHSGMK 1370 1380 1390 1580 1590 1600 1610 1620 1630 pF1KE2 MQKVMPTVPTSQVTGPPPQPPPIRREITFPPGSVEASQPVLKQRRKITSKDIVTPEAWRV :::. : ::.:... : ::: :::.::::::::::.::::::::..: ::: ::::::: CCDS44 MQKAGPPVPASHIAPAPVQPPMIRRDITFPPGSVEATQPVLKQRRRLTMKDIGTPEAWRV 1400 1410 1420 1430 1440 1450 1640 1650 1660 1670 1680 1690 pF1KE2 MMSLKSGLLAESTWALDTINILLYDDSTVATFNLSQLSGFLELLVEYFRKCLIDIFGILM ::::::::::::::::::::::::::... ::::::: :.:::::::::.:::.::::: CCDS44 MMSLKSGLLAESTWALDTINILLYDDNSIMTFNLSQLPGLLELLVEYFRRCLIEIFGILK 1460 1470 1480 1490 1500 1510 1700 1710 1720 1730 1740 1750 pF1KE2 EYEVGDPSQKALDHNAARKDDSQSLADDSGKEEEDAECIDDDEEDEEDEEEDSEKTESDE :::::::.:..: . .: . .: : : :::. : . :.::.:: .:.:: CCDS44 EYEVGDPGQRTL-LDPGRFSKVSSPAPMEGGEEEE-ELLGPKLEEEEEEEV----VENDE 1520 1530 1540 1550 1560 1760 1770 1780 1790 1800 1810 pF1KE2 KSSIALTAPDAAADPKEKPKQASKFDKLPIKIVKKNNLFVVDRSDKLGRVQEFNSGLLHW . ::... : :. . . : :::::::.:::.::. :::: ::::::::::.:::::: CCDS44 E--IAFSGKDKPASENSEEKLISKFDKLPVKIVQKNDPFVVDCSDKLGRVQEFDSGLLHW 1570 1580 1590 1600 1610 1620 1820 1830 1840 1850 1860 pF1KE2 QLGGGDTTEHIQTHFESKMEIPPRRRPPPPLSSAGRKK--EQEGKGDSEEQQ-------- ..:::::::::::::::: :. : : : : : ::. :: . .:. CCDS44 RIGGGDTTEHIQTHFESKTELLPSR-PHAPCPPAPRKHVTTAEGTPGTTDQEGPPPDGPP 1630 1640 1650 1660 1670 1680 1870 1880 1890 1900 1910 1920 pF1KE2 EKSIIATIDDVLSARPGALPEDANPGPQT--ESSKFPFGIQQAKSHRNIKLLEDEPRSRD :: : ::.::.::.: ..: ::. . .. :::::::::. :.::::::.:::::.:.: CCDS44 EKRITATMDDMLSTRSSTLTEDGAKSSEAIKESSKFPFGISPAQSHRNIKILEDEPHSKD 1690 1700 1710 1720 1730 1740 1930 1940 1950 1960 1970 1980 pF1KE2 ETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKHPGLVLILGKLILLHHEHPE ::::::. :::::::::.:::: .:::::::::: ::::::::.:::::::::::.::: CCDS44 ETPLCTLLDWQDSLAKRCVCVSNTIRSLSFVPGNDFEMSKHPGLLLILGKLILLHHKHPE 1750 1760 1770 1780 1790 1800 1990 2000 2010 2020 2030 2040 pF1KE2 RKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTLANISGQLDLSAYTESICLP ::.:: ::::::..:.::.:.: ::::::::.::.:::::::::::::::: : :::::: CCDS44 RKQAPLTYEKEEEQDQGVSCNKVEWWWDCLEMLRENTLVTLANISGQLDLSPYPESICLP 1810 1820 1830 1840 1850 1860 2050 2060 2070 2080 2090 2100 pF1KE2 ILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCKLSIQDNNVDLILATPPFSR .::::::: ::::::::::: :.:::.:::::::::::: :::::::::::::::::::: CCDS44 VLDGLLHWAVCPSAEAQDPFSTLGPNAVLSPQRLVLETLSKLSIQDNNVDLILATPPFSR 1870 1880 1890 1900 1910 1920 2110 2120 2130 2140 2150 2160 pF1KE2 QEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARAIAVQKGSIGNLISFLEDGV ::.:.:.::...::::::::::...::.::::::.::::::::::::::::..::::.. CCDS44 LEKLYSTMVRFLSDRKNPVCREMAVVLLANLAQGDSLAARAIAVQKGSIGNLLGFLEDSL 1930 1940 1950 1960 1970 1980 2170 2180 2190 2200 2210 2220 pF1KE2 TMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARVDENRSEFLLHEGRLLDISI . .:.:::: .:.::: ::.:: ::::: :::.::::.:.::::.::: :.:.::::::. CCDS44 AATQFQQSQASLLHMQNPPFEPTSVDMMRRAARALLALAKVDENHSEFTLYESRLLDISV 1990 2000 2010 2020 2030 2040 2230 2240 pF1KE2 SAVLNSLVASVICDVLFQIGQL : ..::::..::::::: ::: CCDS44 SPLMNSLVSQVICDVLFLIGQS 2050 2060 2249 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Fri Nov 11 16:56:29 2016 done: Fri Nov 11 16:56:31 2016 Total Scan time: 9.650 Total Display time: 1.030 Function used was FASTA [36.3.4 Apr, 2011]