FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KE1600, 2231 aa
1>>>pF1KE1600 2231 - 2231 aa - 2231 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 15.5706+/-0.00132; mu= -18.8387+/- 0.080
mean_var=684.2039+/-139.801, 0's: 0 Z-trim(116.7): 148 B-trim: 0 in 0/53
Lambda= 0.049032
statistics sampled from 17192 (17322) to 17192 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.773), E-opt: 0.2 (0.532), width: 16
Scan time: 5.660
The best scores are:                                      opt bits E(32554)
CCDS5251.2 ARID1B gene_id:57492|Hs108|chr6         (2236) 8763 636.7 3.5e-181
CCDS55072.1 ARID1B gene_id:57492|Hs108|chr6        (2249) 8750 635.8 6.6e-181
CCDS285.1 ARID1A gene_id:8289|Hs108|chr1           (2285) 3826 287.5 4.8e-76
CCDS44091.1 ARID1A gene_id:8289|Hs108|chr1         (2068) 3360 254.5 3.7e-66
>>CCDS5251.2 ARID1B gene_id:57492|Hs108|chr6 (2236 aa)
initn: 8264 init1: 8264 opt: 8763 Z-score: 3366.3 bits: 636.7 E(32554): 3.5e-181
Smith-Waterman score: 15072; 97.6% identity (97.6% similar) in 2231 aa overlap (1-2231:59-2236)
10 20 30
pF1KE1 METGLLPNHKLKTVGEAPAAPPHQQHHHHH
::::::::::::::::::::::::::::::
CCDS52 GSAAALSSSSSSSAAAAAASSSSSSGPGSAMETGLLPNHKLKTVGEAPAAPPHQQHHHHH
30 40 50 60 70 80
40 50 60 70 80 90
pF1KE1 HAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGGGAP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 HAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGGGAP
90 100 110 120 130 140
100 110 120 130 140 150
pF1KE1 QPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGLGAL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 QPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGLGAL
150 160 170 180 190 200
160 170 180 190 200 210
pF1KE1 GTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPASGGPGGRAGPCFDQHGGQQSPGMGMMHS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 GTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPASGGPGGRAGPCFDQHGGQQSPGMGMMHS
210 220 230 240 250 260
220 230 240 250 260 270
pF1KE1 ASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPGYSRPGAGGGGGGGGGGGGGSGGGGGGGG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 ASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPGYSRPGAGGGGGGGGGGGGGSGGGGGGGG
270 280 290 300 310 320
280 290 300 310 320 330
pF1KE1 AGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSSPRQQGGGMMMGPGGGGAA
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 AGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSSPRQQGGGMMMGPGGGGAA
330 340 350 360 370 380
340 350 360 370 380 390
pF1KE1 SLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMMRSYGGSYPEYSSPSAPPP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 SLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMMRSYGGSYPEYSSPSAPPP
390 400 410 420 430 440
400 410 420 430 440 450
pF1KE1 PPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASPAWAAAQQRSHPAMSPGTPGP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 PPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASPAWAAAQQRSHPAMSPGTPGP
450 460 470 480 490 500
460 470 480 490 500 510
pF1KE1 TMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQQSSPYPGGSYGPPGPQRYPIGIQGRTPG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 TMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQQSSPYPGGSYGPPGPQRYPIGIQGRTPG
510 520 530 540 550 560
520 530 540 550 560 570
pF1KE1 AMAGMQYPQQQMPPQYGQQGVSGYCQQGQQPYYSQQPQPPHLPPQAQYLPSQSQQRYQPQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 AMAGMQYPQQQMPPQYGQQGVSGYCQQGQQPYYSQQPQPPHLPPQAQYLPSQSQQRYQPQ
570 580 590 600 610 620
580 590 600 610 620 630
pF1KE1 QDMSQEGYGTRSQPPLAPGKPNHEDLNLIQQERPSSLPDLSGSIDDLPTGTEATLSSAVS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 QDMSQEGYGTRSQPPLAPGKPNHEDLNLIQQERPSSLPDLSGSIDDLPTGTEATLSSAVS
630 640 650 660 670 680
640 650 660 670 680 690
pF1KE1 ASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPVGSNQSRSGPISPASIPG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 ASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPVGSNQSRSGPISPASIPG
690 700 710 720 730 740
700 710 720 730 740 750
pF1KE1 SQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQYGPQQTGPSMSPHPSPG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 SQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQYGPQQTGPSMSPHPSPG
750 760 770 780 790 800
760 770 780 790 800 810
pF1KE1 GQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPSASYSGPGPGMGISANNQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 GQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPSASYSGPGPGMGISANNQ
810 820 830 840 850 860
820 830 840 850 860 870
pF1KE1 MHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSPGMSQQGGPGMGPPMPTVNRK
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 MHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSPGMSQQGGPGMGPPMPTVNRK
870 880 890 900 910 920
880 890 900 910 920 930
pF1KE1 AQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQPMNNSSSLMNTQAPPYSMAP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 AQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQPMNNSSSLMNTQAPPYSMAP
930 940 950 960 970 980
940 950 960 970 980 990
pF1KE1 AMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPESKSKDSYSSQGISQPPTPGN
::::::::::::::::::::::::::::::::::::::::::::
CCDS52 AMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPESKSK----------------
990 1000 1010 1020 1030
1000 1010 1020 1030 1040 1050
pF1KE1 LPVPSPMSPSSASISSFHGDESDSISSPGWPKTPSSPKSSSSTTTGEKITKVYELGNEPE
:::::::::::::::::::::::
CCDS52 -------------------------------------KSSSSTTTGEKITKVYELGNEPE
1040 1050
1060 1070 1080 1090 1100 1110
pF1KE1 RKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQVNKNKKWRELATN
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 RKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQVNKNKKWRELATN
1060 1070 1080 1090 1100 1110
1120 1130 1140 1150 1160 1170
pF1KE1 LNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQPKLQPPSPANSGS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 LNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQPKLQPPSPANSGS
1120 1130 1140 1150 1160 1170
1180 1190 1200 1210 1220 1230
pF1KE1 LQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTISVHDPFSDVSDSSF
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 LQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTISVHDPFSDVSDSSF
1180 1190 1200 1210 1220 1230
1240 1250 1260 1270 1280 1290
pF1KE1 PKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPFMTQGQMPNSSMQD
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 PKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPFMTQGQMPNSSMQD
1240 1250 1260 1270 1280 1290
1300 1310 1320 1330 1340 1350
pF1KE1 MYNQSPSGAMSNLGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQPPYGGHQPGLYPQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 MYNQSPSGAMSNLGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQPPYGGHQPGLYPQ
1300 1310 1320 1330 1340 1350
1360 1370 1380 1390 1400 1410
pF1KE1 QPNYKRHMDGMYGPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPDRRPIQGQYPYPYS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 QPNYKRHMDGMYGPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPDRRPIQGQYPYPYS
1360 1370 1380 1390 1400 1410
1420 1430 1440 1450 1460 1470
pF1KE1 RERMQGPGQIQTHGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 RERMQGPGQIQTHGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAP
1420 1430 1440 1450 1460 1470
1480 1490 1500 1510 1520 1530
pF1KE1 PYPGMNRTDDMMVPDQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPN
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 PYPGMNRTDDMMVPDQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPN
1480 1490 1500 1510 1520 1530
1540 1550 1560 1570 1580 1590
pF1KE1 HISRAPSPASFQRSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGPPPQPPPIRREITF
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 HISRAPSPASFQRSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGPPPQPPPIRREITF
1540 1550 1560 1570 1580 1590
1600 1610 1620 1630 1640 1650
pF1KE1 PPGSVEASQPVLKQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTV
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 PPGSVEASQPVLKQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTV
1600 1610 1620 1630 1640 1650
1660 1670 1680 1690 1700 1710
pF1KE1 ATFNLSQLSGFLELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 ATFNLSQLSGFLELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDS
1660 1670 1680 1690 1700 1710
1720 1730 1740 1750 1760 1770
pF1KE1 GKEEEDAECIDDDEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 GKEEEDAECIDDDEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLP
1720 1730 1740 1750 1760 1770
1780 1790 1800 1810 1820 1830
pF1KE1 IKIVKKNNLFVVDRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 IKIVKKNNLFVVDRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPP
1780 1790 1800 1810 1820 1830
1840 1850 1860 1870 1880 1890
pF1KE1 PLSSAGRKKEQEGKGDSEEQQEKSIIATIDDVLSARPGALPEDANPGPQTESSKFPFGIQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 PLSSAGRKKEQEGKGDSEEQQEKSIIATIDDVLSARPGALPEDANPGPQTESSKFPFGIQ
1840 1850 1860 1870 1880 1890
1900 1910 1920 1930 1940 1950
pF1KE1 QAKSHRNIKLLEDEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKH
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 QAKSHRNIKLLEDEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKH
1900 1910 1920 1930 1940 1950
1960 1970 1980 1990 2000 2010
pF1KE1 PGLVLILGKLILLHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 PGLVLILGKLILLHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTL
1960 1970 1980 1990 2000 2010
2020 2030 2040 2050 2060 2070
pF1KE1 ANISGQLDLSAYTESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCK
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 ANISGQLDLSAYTESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCK
2020 2030 2040 2050 2060 2070
2080 2090 2100 2110 2120 2130
pF1KE1 LSIQDNNVDLILATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARA
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 LSIQDNNVDLILATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARA
2080 2090 2100 2110 2120 2130
2140 2150 2160 2170 2180 2190
pF1KE1 IAVQKGSIGNLISFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARV
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 IAVQKGSIGNLISFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARV
2140 2150 2160 2170 2180 2190
2200 2210 2220 2230
pF1KE1 DENRSEFLLHEGRLLDISISAVLNSLVASVICDVLFQIGQL
:::::::::::::::::::::::::::::::::::::::::
CCDS52 DENRSEFLLHEGRLLDISISAVLNSLVASVICDVLFQIGQL
2200 2210 2220 2230
>>CCDS55072.1 ARID1B gene_id:57492|Hs108|chr6 (2249 aa)
initn: 12015 init1: 8264 opt: 8750 Z-score: 3361.3 bits: 635.8 E(32554): 6.6e-181
Smith-Waterman score: 15036; 97.1% identity (97.1% similar) in 2244 aa overlap (1-2231:59-2249)
10 20 30
pF1KE1 METGLLPNHKLKTVGEAPAAPPHQQHHHHH
::::::::::::::::::::::::::::::
CCDS55 GSAAALSSSSSSSAAAAAASSSSSSGPGSAMETGLLPNHKLKTVGEAPAAPPHQQHHHHH
30 40 50 60 70 80
40 50 60 70 80 90
pF1KE1 HAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGGGAP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 HAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGGGAP
90 100 110 120 130 140
100 110 120 130 140 150
pF1KE1 QPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGLGAL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 QPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGLGAL
150 160 170 180 190 200
160 170 180 190 200 210
pF1KE1 GTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPASGGPGGRAGPCFDQHGGQQSPGMGMMHS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 GTQQPPVAVPGGGGGPAAVPEFNNYYGSAAPASGGPGGRAGPCFDQHGGQQSPGMGMMHS
210 220 230 240 250 260
220 230 240 250 260 270
pF1KE1 ASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPGYSRPGAGGGGGGGGGGGGGSGGGGGGGG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 ASAAAAGAPGSMDPLQNSHEGYPNSQCNHYPGYSRPGAGGGGGGGGGGGGGSGGGGGGGG
270 280 290 300 310 320
280 290 300 310 320 330
pF1KE1 AGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSSPRQQGGGMMMGPGGGGAA
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 AGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSSPRQQGGGMMMGPGGGGAA
330 340 350 360 370 380
340 350 360 370 380 390
pF1KE1 SLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMMRSYGGSYPEYSSPSAPPP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 SLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMMRSYGGSYPEYSSPSAPPP
390 400 410 420 430 440
400 410 420 430 440 450
pF1KE1 PPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASPAWAAAQQRSHPAMSPGTPGP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASPAWAAAQQRSHPAMSPGTPGP
450 460 470 480 490 500
460 470 480 490 500 510
pF1KE1 TMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQQSSPYPGGSYGPPGPQRYPIGIQGRTPG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 TMGRSQGSPMDPMVMKRPQLYGMGSNPHSQPQQSSPYPGGSYGPPGPQRYPIGIQGRTPG
510 520 530 540 550 560
520 530 540 550
pF1KE1 AMAGMQYPQQQ-------------MPPQYGQQGVSGYCQQGQQPYYSQQPQPPHLPPQAQ
::::::::::: ::::::::::::::::::::::::::::::::::::
CCDS55 AMAGMQYPQQQDSGDATWKETFWLMPPQYGQQGVSGYCQQGQQPYYSQQPQPPHLPPQAQ
570 580 590 600 610 620
560 570 580 590 600 610
pF1KE1 YLPSQSQQRYQPQQDMSQEGYGTRSQPPLAPGKPNHEDLNLIQQERPSSLPDLSGSIDDL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 YLPSQSQQRYQPQQDMSQEGYGTRSQPPLAPGKPNHEDLNLIQQERPSSLPDLSGSIDDL
630 640 650 660 670 680
620 630 640 650 660 670
pF1KE1 PTGTEATLSSAVSASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPVGSNQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PTGTEATLSSAVSASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPVGSNQ
690 700 710 720 730 740
680 690 700 710 720 730
pF1KE1 SRSGPISPASIPGSQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQYGPQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 SRSGPISPASIPGSQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQYGPQ
750 760 770 780 790 800
740 750 760 770 780 790
pF1KE1 QTGPSMSPHPSPGGQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPSASYS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 QTGPSMSPHPSPGGQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPSASYS
810 820 830 840 850 860
800 810 820 830 840 850
pF1KE1 GPGPGMGISANNQMHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSPGMSQQGG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 GPGPGMGISANNQMHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSPGMSQQGG
870 880 890 900 910 920
860 870 880 890 900 910
pF1KE1 PGMGPPMPTVNRKAQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQPMNNSSS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PGMGPPMPTVNRKAQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQPMNNSSS
930 940 950 960 970 980
920 930 940 950 960 970
pF1KE1 LMNTQAPPYSMAPAMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPESKSKDSY
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 LMNTQAPPYSMAPAMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPESKSK---
990 1000 1010 1020 1030 1040
980 990 1000 1010 1020 1030
pF1KE1 SSQGISQPPTPGNLPVPSPMSPSSASISSFHGDESDSISSPGWPKTPSSPKSSSSTTTGE
::::::::::
CCDS55 --------------------------------------------------KSSSSTTTGE
1050
1040 1050 1060 1070 1080 1090
pF1KE1 KITKVYELGNEPERKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 KITKVYELGNEPERKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRLYVCVKEIGGLAQ
1060 1070 1080 1090 1100 1110
1100 1110 1120 1130 1140 1150
pF1KE1 VNKNKKWRELATNLNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 VNKNKKWRELATNLNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPPPEVFSTGDTKKQ
1120 1130 1140 1150 1160 1170
1160 1170 1180 1190 1200 1210
pF1KE1 PKLQPPSPANSGSLQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTIS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PKLQPPSPANSGSLQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQMTPMQGGRSSTIS
1180 1190 1200 1210 1220 1230
1220 1230 1240 1250 1260 1270
pF1KE1 VHDPFSDVSDSSFPKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPF
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 VHDPFSDVSDSSFPKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDPFGGMRKVPGSSEPF
1240 1250 1260 1270 1280 1290
1280 1290 1300 1310 1320 1330
pF1KE1 MTQGQMPNSSMQDMYNQSPSGAMSNLGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 MTQGQMPNSSMQDMYNQSPSGAMSNLGMGQRQQFPYGASYDRRHEPYGQQYPGQGPPSGQ
1300 1310 1320 1330 1340 1350
1340 1350 1360 1370 1380 1390
pF1KE1 PPYGGHQPGLYPQQPNYKRHMDGMYGPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPD
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PPYGGHQPGLYPQQPNYKRHMDGMYGPPAKRHEGDMYNMQYSSQQQEMYNQYGGSYSGPD
1360 1370 1380 1390 1400 1410
1400 1410 1420 1430 1440 1450
pF1KE1 RRPIQGQYPYPYSRERMQGPGQIQTHGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPY
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 RRPIQGQYPYPYSRERMQGPGQIQTHGIPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPY
1420 1430 1440 1450 1460 1470
1460 1470 1480 1490 1500 1510
pF1KE1 QNRQGPGGPTQAPPYPGMNRTDDMMVPDQRINHESQWPSHVSQRQPYMSSSASMQPITRP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 QNRQGPGGPTQAPPYPGMNRTDDMMVPDQRINHESQWPSHVSQRQPYMSSSASMQPITRP
1480 1490 1500 1510 1520 1530
1520 1530 1540 1550 1560 1570
pF1KE1 PQPSYQTPPSLPNHISRAPSPASFQRSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PQPSYQTPPSLPNHISRAPSPASFQRSLENRMSPSKSPFLPSMKMQKVMPTVPTSQVTGP
1540 1550 1560 1570 1580 1590
1580 1590 1600 1610 1620 1630
pF1KE1 PPQPPPIRREITFPPGSVEASQPVLKQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWAL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PPQPPPIRREITFPPGSVEASQPVLKQRRKITSKDIVTPEAWRVMMSLKSGLLAESTWAL
1600 1610 1620 1630 1640 1650
1640 1650 1660 1670 1680 1690
pF1KE1 DTINILLYDDSTVATFNLSQLSGFLELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNA
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 DTINILLYDDSTVATFNLSQLSGFLELLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNA
1660 1670 1680 1690 1700 1710
1700 1710 1720 1730 1740 1750
pF1KE1 ARKDDSQSLADDSGKEEEDAECIDDDEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPK
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 ARKDDSQSLADDSGKEEEDAECIDDDEEDEEDEEEDSEKTESDEKSSIALTAPDAAADPK
1720 1730 1740 1750 1760 1770
1760 1770 1780 1790 1800 1810
pF1KE1 EKPKQASKFDKLPIKIVKKNNLFVVDRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 EKPKQASKFDKLPIKIVKKNNLFVVDRSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFE
1780 1790 1800 1810 1820 1830
1820 1830 1840 1850 1860 1870
pF1KE1 SKMEIPPRRRPPPPLSSAGRKKEQEGKGDSEEQQEKSIIATIDDVLSARPGALPEDANPG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 SKMEIPPRRRPPPPLSSAGRKKEQEGKGDSEEQQEKSIIATIDDVLSARPGALPEDANPG
1840 1850 1860 1870 1880 1890
1880 1890 1900 1910 1920 1930
pF1KE1 PQTESSKFPFGIQQAKSHRNIKLLEDEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 PQTESSKFPFGIQQAKSHRNIKLLEDEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSL
1900 1910 1920 1930 1940 1950
1940 1950 1960 1970 1980 1990
pF1KE1 SFVPGNDAEMSKHPGLVLILGKLILLHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWD
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 SFVPGNDAEMSKHPGLVLILGKLILLHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWD
1960 1970 1980 1990 2000 2010
2000 2010 2020 2030 2040 2050
pF1KE1 CLEVLRDNTLVTLANISGQLDLSAYTESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSV
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 CLEVLRDNTLVTLANISGQLDLSAYTESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSV
2020 2030 2040 2050 2060 2070
2060 2070 2080 2090 2100 2110
pF1KE1 LSPQRLVLETLCKLSIQDNNVDLILATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 LSPQRLVLETLCKLSIQDNNVDLILATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALL
2080 2090 2100 2110 2120 2130
2120 2130 2140 2150 2160 2170
pF1KE1 SNLAQGDALAARAIAVQKGSIGNLISFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMM
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 SNLAQGDALAARAIAVQKGSIGNLISFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMM
2140 2150 2160 2170 2180 2190
2180 2190 2200 2210 2220 2230
pF1KE1 CRAAKALLAMARVDENRSEFLLHEGRLLDISISAVLNSLVASVICDVLFQIGQL
::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS55 CRAAKALLAMARVDENRSEFLLHEGRLLDISISAVLNSLVASVICDVLFQIGQL
2200 2210 2220 2230 2240
>>CCDS285.1 ARID1A gene_id:8289|Hs108|chr1 (2285 aa)
initn: 4735 init1: 1919 opt: 3826 Z-score: 1478.8 bits: 287.5 E(32554): 4.8e-76
Smith-Waterman score: 7127; 51.1% identity (68.5% similar) in 2409 aa overlap (58-2230:25-2284)
30 40 50 60 70 80
pF1KE1 HHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGG
.. .:::... . . . . :..
CCDS28 MAAQVAPAAASSLGNPPPPPPSELKKAEQQQREEAGGEAAAAAAAERGEMKAAA
10 20 30 40 50
90 100 110 120 130 140
pF1KE1 GAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGL
: . :: . :: : :. :.... : . : : : . . :. .:.:
CCDS28 GQESEGPAVGPPQPLG-KELQDGAESNGGGGGGGAGSGGGPGAEPDLKNSNGNAGPRPAL
60 70 80 90 100 110
150 160 170 180 190
pF1KE1 GALGTQQPPVAVPGGGGGPA---AVPEFNNYYGSAAPASG-G-PGGR--------AGPCF
. . .:: ::::: . ..: . . :: : : : :: :. :
CCDS28 NN-NLTEPP---GGGGGGSSDGVGAPPHSAAAALPPPAYGFGQPYGRSPSAVAAAAAAVF
120 130 140 150 160
200 210 220 230 240 250
pF1KE1 -DQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHE-GYPNSQCNHYPGYSRPGAGGGG
.:::::::::.. ..:... .: : ::::. :.:: : : : : .:
CCDS28 HQQHGGQQSPGLAALQSGGG--GGLEPYAGPQQNSHDHGFPNHQYNSY--YPNRSAYPPP
170 180 190 200 210 220
260 270 280 290 300 310
pF1KE1 GGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSS
. . . .. :: :.:.:.:.:. ..:.:.... ::
CCDS28 APAYALSSPRGGTPGSGAAAAAGSKPPPSSSASASSSS--------------------SS
230 240 250 260
320 330 340 350 360 370
pF1KE1 PRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMM
:: : : :::: ::::: : : :. :::::::::::::
CCDS28 FAQQRFGAM---GGGGP---------SAAGG-----GTPQ-PT-ATPTLNQLLTSPSSA-
270 280 290 300
380 390 400 410 420 430
pF1KE1 RSYGGSYP--EYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASP
:.: : :: .::. ::. .:. : : ..:. : .: :::.
CCDS28 RGYQG-YPGGDYSGG---------PQDGGAGKGPA---DMASQCWG----AAAAAAAAAA
310 320 330 340
440 450 460 470 480
pF1KE1 AWAAAQQRSHPA-MSPGTPG----PTMGRSQ-GSPMDPMVMKRPQLYGMGSNPHSQ----
: ..:::::: : ::::. : : : .:::: : ::: :: :.::.::
CCDS28 ASGGAQQRSHHAPMSPGSSGGGGQPLARTPQPSSPMDQMGKMRPQPYG-GTNPYSQQQGP
350 360 370 380 390 400
490 500 510 520 530
pF1KE1 ---PQQSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQYPQQQMPPQYGQQGVSGYCQQ
:::. ::: :: :::::. .:::. .::.:..: :: .:: ::::: ::: ::
CCDS28 PSGPQQGHGYPGQPYGSQTPQRYPMTMQGRAQSAMGGLSYTQQ-IPP-YGQQGPSGYGQQ
410 420 430 440 450 460
540 550
pF1KE1 GQQPYY--------------SQQP-----------------QPPHL----PPQAQY----
:: ::: :::: :::.: :: .:
CCDS28 GQTPYYNQQSPHPQQQQPPYSQQPPSQTPHAQPSYQQQPQSQPPQLQSSQPPYSQQPSQP
470 480 490 500 510 520
560
pF1KE1 --------LPSQ------------------------------------------------
:::
CCDS28 PHQQSPAPYPSQQSTTQQHPQSQPPYSQPQAQSPYQQQQPQQPAPSTLSQQAAYPQPQSQ
530 540 550 560 570 580
570 580 590 600 610
pF1KE1 -------SQQRYQPQQDMSQEGYGTR--SQPPLAPGKPNHEDLNLIQQERPSSLPDLSGS
::::. : :..::...:.. : : .. .: ..::.:: : :::::::::::
CCDS28 QSQQTAYSQQRFPPPQELSQDSFGSQASSAPSMTSSKGGQEDMNLSLQSRPSSLPDLSGS
590 600 610 620 630 640
620 630 640 650 660 670
pF1KE1 IDDLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPV
::::: :::..:: .::.:: .::::.::::::::::::.:::: .: : :::::::::.
CCDS28 IDDLPMGTEGALSPGVSTSGISSSQGEQSNPAQSPFSPHTSPHLPGIRG-PSPSPVGSPA
650 660 670 680 690 700
680 690 700 710 720 730
pF1KE1 GSNQSRSGPISPASIPGSQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQ
. ::::::.:::..::.::::.::..::.: ::...:: . :.::.: :::::: :
CCDS28 SVAQSRSGPLSPAAVPGNQMPPRPPSGQSDSIMHPSMNQSSIAQDRGYM---QRNPQMPQ
710 720 730 740 750 760
740 750 760 770 780 790
pF1KE1 YGPQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPS
:. : : ..::. :::.:.:..:.:: :: :.:::: .::::::.: : : :...:.
CCDS28 YSSPQPGSALSPRQPSGGQIHTGMGSYQQ-NSMGSYGPQGGQYGPQGGYPRQPNYNALPN
770 780 790 800 810 820
800 810 820 830 840 850
pF1KE1 ASYSGPGPGMGIS---ANNQMHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSP
:.: . : . ::. :..::::: : :..: ::: :.: :::. ::..: :
CCDS28 ANYPSAGMAGGINPMGAGGQMHGQPGIPPYGTLPPGRMSHASMGNRPYGPNMANMPP---
830 840 850 860 870
860 870 880 890 900 910
pF1KE1 GMSQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQ
: : :: :: .:::.::.:.: :..:::: :.: ..:.:::.:.:... ::.:
CCDS28 ----QVGSGMCPPPGGMNRKTQETAVA-MHVAANSIQNRPPGYPNMNQGGMMGTGPPYGQ
880 890 900 910 920 930
920 930 940 950 960 970
pF1KE1 PMNNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPE
.:. ....: :.:::::. .:.:.::. .. .::. :. :: : ..: .:::. :
CCDS28 GINSMAGMINPQGPPYSMGGTMANNSAGMAASPEMMGLGDVKLTPATKMNNKADGTPKTE
940 950 960 970 980 990
980 990 1000 1010 1020 1030
pF1KE1 SKSKDSYSSQGISQPPTPGNLPVPSPMSPSSASISSFHGDESDSISSPGWPKTPSSPKSS
:::: :::
CCDS28 SKSK-----------------------------------------------------KSS
1040 1050 1060 1070 1080 1090
pF1KE1 SSTTTGEKITKVYELGNEPERKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRLYVCVK
:::::.:::::.::::.:::::.::::::.: ::.. ...:::::.:::::.:::: ::
CCDS28 SSTTTNEKITKLYELGGEPERKMWVDRYLAFTEEKAMGMTNLPAVGRKPLDLYRLYVSVK
1000 1010 1020 1030 1040 1050
1100 1110 1120 1130 1140 1150
pF1KE1 EIGGLAQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPPPEVFS
:::::.::::::::::::::::::::::::::::::::: :.::::::::::.:::..:.
CCDS28 EIGGLTQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQCLYAFECKIERGEDPPPDIFA
1060 1070 1080 1090 1100 1110
1160 1170 1180 1190 1200
pF1KE1 TGDTKK-QPKLQPPSPANSGSLQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQMTPMQ
..:.:: :::.::::::.:::.::::::::: :.:::: :::::::::::::.:. :.
CCDS28 AADSKKSQPKIQPPSPAGSGSMQGPQTPQST-SSSMAE-GGDLKPPTPASTPHSQIPPLP
1120 1130 1140 1150 1160 1170
1210 1220 1230 1240 1250 1260
pF1KE1 G-GRSSTISVHDPFSDVSDSSFPKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDPFGGMR
: .::......: :.: :::.: :::::::: :: .:. :.:::: :::::::.:.::
CCDS28 GMSRSNSVGIQDAFNDGSDSTFQKRNSMTPNPGYQPSMNTSDMMGRMSYEPNKDPYGSMR
1180 1190 1200 1210 1220 1230
1270 1280 1290 1300 1310
pF1KE1 KVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSNLGMGQRQQFPYGASYDR---------
:.::: .:::..:: ::..: : :... . ...:..:: ::..:::. :::
CCDS28 KAPGS-DPFMSSGQGPNGGMGDPYSRAAGPGLGNVAMGPRQHYPYGGPYDRVRTEPGIGP
1240 1250 1260 1270 1280 1290
1320 1330 1340
pF1KE1 --------------------------------------RHEPYGQQYPGQGPPSGQPPYG
::. ::.:. :: :::.: .
CCDS28 EGNMSTGAPQPNLMPSNPDSGMYSPSRYPPQQQQQQQQRHDSYGNQFSTQGTPSGSP-FP
1300 1310 1320 1330 1340 1350
1350 1360 1370
pF1KE1 GHQPGLYPQQP-NYKRHMDGMYGPPAKRHEGDMYNMQYS---------------------
..: .: :: :::: ::: ::::::::::.::.. ::
CCDS28 SQQTTMYQQQQQNYKRPMDGTYGPPAKRHEGEMYSVPYSTGQGQPQQQQLPPAQPQPASQ
1360 1370 1380 1390 1400 1410
1380 1390 1400 1410 1420
pF1KE1 ------SQQQEMYNQYGGSY-----SGPDRRPI---QGQYPYPYSRERMQGP-GQIQTHG
: ::..:::::..: .. .::: :.:.:. ..:.:...: : ..
CCDS28 QQAAQPSPQQDVYNQYGNAYPATATAATERRPAGGPQNQFPFQFGRDRVSAPPGTNAQQN
1420 1430 1440 1450 1460 1470
1430 1440 1450 1460 1470 1480
pF1KE1 IPPQMMGGPLQSSSSEGPQQNMWAARNDMPYPYQNRQGPGGPTQAPPYPGMNRTDDMMVP
.::::::::.:.:. . : .:: .:::: : : :::. :. :.: : :.::::.:.
CCDS28 MPPQMMGGPIQASAEVAQQGTMWQGRNDMTYNYANRQSTGSAPQGPAYHGVNRTDEMLHT
1480 1490 1500 1510 1520 1530
1490 1500 1510 1520 1530 1540
pF1KE1 DQRINHESQWPSHVSQRQPYMSSSASMQPITRPPQPSYQTPPSLPNHISRAPSPASFQRS
::: :::..:::: . ::: .. :: . :.:::: .:: :::. ::: .. ::: . :
CCDS28 DQRANHEGSWPSH-GTRQPPYGPSAPVPPMTRPPPSNYQPPPSMQNHIPQVSSPAPLPRP
1540 1550 1560 1570 1580 1590
1550 1560 1570 1580 1590 1600
pF1KE1 LENRMSPSKSPFLPS-MKMQKVMPTVPTSQVTGPPPQPPPIRREITFPPGSVEASQPVLK
.::: :::::::: : :::::. : ::.:... : ::: :::.::::::::::.:::::
CCDS28 MENRTSPSKSPFLHSGMKMQKAGPPVPASHIAPAPVQPPMIRRDITFPPGSVEATQPVLK
1600 1610 1620 1630 1640 1650
1610 1620 1630 1640 1650 1660
pF1KE1 QRRKITSKDIVTPEAWRVMMSLKSGLLAESTWALDTINILLYDDSTVATFNLSQLSGFLE
:::..: ::: :::::::::::::::::::::::::::::::::... ::::::: :.::
CCDS28 QRRRLTMKDIGTPEAWRVMMSLKSGLLAESTWALDTINILLYDDNSIMTFNLSQLPGLLE
1660 1670 1680 1690 1700 1710
1670 1680 1690 1700 1710 1720
pF1KE1 LLVEYFRKCLIDIFGILMEYEVGDPSQKALDHNAARKDDSQSLADDSGKEEEDAECIDDD
:::::::.:::.::::: :::::::.:..: . .: . .: : : :::. : .
CCDS28 LLVEYFRRCLIEIFGILKEYEVGDPGQRTL-LDPGRFSKVSSPAPMEGGEEEE-ELLGPK
1720 1730 1740 1750 1760 1770
1730 1740 1750 1760 1770 1780
pF1KE1 EEDEEDEEEDSEKTESDEKSSIALTAPDAAADPKEKPKQASKFDKLPIKIVKKNNLFVVD
:.::.:: .:.::. ::... : :. . . : :::::::.:::.::. ::::
CCDS28 LEEEEEEEV----VENDEE--IAFSGKDKPASENSEEKLISKFDKLPVKIVQKNDPFVVD
1780 1790 1800 1810 1820
1790 1800 1810 1820 1830 1840
pF1KE1 RSDKLGRVQEFNSGLLHWQLGGGDTTEHIQTHFESKMEIPPRRRPPPPLSSAGRKK--EQ
::::::::::.::::::..:::::::::::::::: :. : : : : : ::.
CCDS28 CSDKLGRVQEFDSGLLHWRIGGGDTTEHIQTHFESKTELLPSR-PHAPCPPAPRKHVTTA
1830 1840 1850 1860 1870 1880
1850 1860 1870 1880 1890
pF1KE1 EGKGDSEEQQ--------EKSIIATIDDVLSARPGALPEDANPGPQT--ESSKFPFGIQQ
:: . .:. :: : ::.::.::.: ..: ::. . .. :::::::::.
CCDS28 EGTPGTTDQEGPPPDGPPEKRITATMDDMLSTRSSTLTEDGAKSSEAIKESSKFPFGISP
1890 1900 1910 1920 1930 1940
1900 1910 1920 1930 1940 1950
pF1KE1 AKSHRNIKLLEDEPRSRDETPLCTIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKHP
:.::::::.:::::.:.:::::::. :::::::::.:::: .:::::::::: ::::::
CCDS28 AQSHRNIKILEDEPHSKDETPLCTLLDWQDSLAKRCVCVSNTIRSLSFVPGNDFEMSKHP
1950 1960 1970 1980 1990 2000
1960 1970 1980 1990 2000 2010
pF1KE1 GLVLILGKLILLHHEHPERKRAPQTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTLA
::.:::::::::::.:::::.:: ::::::..:.::.:.: ::::::::.::.:::::::
CCDS28 GLLLILGKLILLHHKHPERKQAPLTYEKEEEQDQGVSCNKVEWWWDCLEMLRENTLVTLA
2010 2020 2030 2040 2050 2060
2020 2030 2040 2050 2060 2070
pF1KE1 NISGQLDLSAYTESICLPILDGLLHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCKL
::::::::: : ::::::.::::::: ::::::::::: :.:::.:::::::::::: ::
CCDS28 NISGQLDLSPYPESICLPVLDGLLHWAVCPSAEAQDPFSTLGPNAVLSPQRLVLETLSKL
2070 2080 2090 2100 2110 2120
2080 2090 2100 2110 2120 2130
pF1KE1 SIQDNNVDLILATPPFSRQEKFYATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARAI
:::::::::::::::::: ::.:.:.::...::::::::::...::.::::::.::::::
CCDS28 SIQDNNVDLILATPPFSRLEKLYSTMVRFLSDRKNPVCREMAVVLLANLAQGDSLAARAI
2130 2140 2150 2160 2170 2180
2140 2150 2160 2170 2180 2190
pF1KE1 AVQKGSIGNLISFLEDGVTMAQYQQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARVD
::::::::::..::::... .:.:::: .:.::: ::.:: ::::: :::.::::.:.::
CCDS28 AVQKGSIGNLLGFLEDSLAATQFQQSQASLLHMQNPPFEPTSVDMMRRAARALLALAKVD
2190 2200 2210 2220 2230 2240
2200 2210 2220 2230
pF1KE1 ENRSEFLLHEGRLLDISISAVLNSLVASVICDVLFQIGQL
::.::: :.:.::::::.: ..::::..::::::: :::
CCDS28 ENHSEFTLYESRLLDISVSPLMNSLVSQVICDVLFLIGQS
2250 2260 2270 2280
>>CCDS44091.1 ARID1A gene_id:8289|Hs108|chr1 (2068 aa)
initn: 3829 init1: 1919 opt: 3360 Z-score: 1301.2 bits: 254.5 E(32554): 3.7e-66
Smith-Waterman score: 6385; 49.7% identity (66.5% similar) in 2326 aa overlap (58-2230:25-2067)
30 40 50 60 70 80
pF1KE1 HHHHAHHHHHHAHHLHHHHALQQQLNQFQQQQQQQQQQQQQQQQQQHPISNNNSLGGAGG
.. .:::... . . . . :..
CCDS44 MAAQVAPAAASSLGNPPPPPPSELKKAEQQQREEAGGEAAAAAAAERGEMKAAA
10 20 30 40 50
90 100 110 120 130 140
pF1KE1 GAPQPGPDMEQPQHGGAKDSAAGGQADPPGPPLLSKPGDEDDAPPKMGEPAGGRYEHPGL
: . :: . :: : :. :.... : . : : : . . :. .:.:
CCDS44 GQESEGPAVGPPQPLG-KELQDGAESNGGGGGGGAGSGGGPGAEPDLKNSNGNAGPRPAL
60 70 80 90 100 110
150 160 170 180 190
pF1KE1 GALGTQQPPVAVPGGGGGPA---AVPEFNNYYGSAAPASG-G-PGGR--------AGPCF
. . .:: ::::: . ..: . . :: : : : :: :. :
CCDS44 NN-NLTEPP---GGGGGGSSDGVGAPPHSAAAALPPPAYGFGQPYGRSPSAVAAAAAAVF
120 130 140 150 160
200 210 220 230 240 250
pF1KE1 -DQHGGQQSPGMGMMHSASAAAAGAPGSMDPLQNSHE-GYPNSQCNHYPGYSRPGAGGGG
.:::::::::.. ..:... .: : ::::. :.:: : : : : .:
CCDS44 HQQHGGQQSPGLAALQSGGG--GGLEPYAGPQQNSHDHGFPNHQYNSY--YPNRSAYPPP
170 180 190 200 210 220
260 270 280 290 300 310
pF1KE1 GGGGGGGGGSGGGGGGGGAGAGGAGAGAVAAAAAAAAAAAGGGGGGGYGGSSAGYGVLSS
. . . .. :: :.:.:.:.:. ..:.:.... ::
CCDS44 APAYALSSPRGGTPGSGAAAAAGSKPPPSSSASASSSS--------------------SS
230 240 250 260
320 330 340 350 360 370
pF1KE1 PRQQGGGMMMGPGGGGAASLSKAAAGSAAGGFQRFAGQNQHPSGATPTLNQLLTSPSPMM
:: : : :::: ::::: : : :. :::::::::::::
CCDS44 FAQQRFGAM---GGGGP---------SAAGG-----GTPQ-PT-ATPTLNQLLTSPSSA-
270 280 290 300
380 390 400 410 420 430
pF1KE1 RSYGGSYP--EYSSPSAPPPPPSQPQSQAAAAGAAAGGQQAAAGMGLGKDMGAQYAAASP
:.: : :: .::. ::. .:. : : ..:. : .: :::.
CCDS44 RGYQG-YPGGDYSGG---------PQDGGAGKGPA---DMASQCWG----AAAAAAAAAA
310 320 330 340
440 450 460 470 480
pF1KE1 AWAAAQQRSHPA-MSPGTPG----PTMGRSQ-GSPMDPMVMKRPQLYGMGSNPHSQ----
: ..:::::: : ::::. : : : .:::: : ::: :: :.::.::
CCDS44 ASGGAQQRSHHAPMSPGSSGGGGQPLARTPQPSSPMDQMGKMRPQPYG-GTNPYSQQQGP
350 360 370 380 390 400
490 500 510 520 530
pF1KE1 ---PQQSSPYPGGSYGPPGPQRYPIGIQGRTPGAMAGMQYPQQQMPPQYGQQGVSGYCQQ
:::. ::: :: :::::. .:::. .::.:..: :: .:: ::::: ::: ::
CCDS44 PSGPQQGHGYPGQPYGSQTPQRYPMTMQGRAQSAMGGLSYTQQ-IPP-YGQQGPSGYGQQ
410 420 430 440 450 460
540 550
pF1KE1 GQQPYY--------------SQQP-----------------QPPHL----PPQAQY----
:: ::: :::: :::.: :: .:
CCDS44 GQTPYYNQQSPHPQQQQPPYSQQPPSQTPHAQPSYQQQPQSQPPQLQSSQPPYSQQPSQP
470 480 490 500 510 520
560
pF1KE1 --------LPSQ------------------------------------------------
:::
CCDS44 PHQQSPAPYPSQQSTTQQHPQSQPPYSQPQAQSPYQQQQPQQPAPSTLSQQAAYPQPQSQ
530 540 550 560 570 580
570 580 590 600 610
pF1KE1 -------SQQRYQPQQDMSQEGYGTR--SQPPLAPGKPNHEDLNLIQQERPSSLPDLSGS
::::. : :..::...:.. : : .. .: ..::.:: : :::::::::::
CCDS44 QSQQTAYSQQRFPPPQELSQDSFGSQASSAPSMTSSKGGQEDMNLSLQSRPSSLPDLSGS
590 600 610 620 630 640
620 630 640 650 660 670
pF1KE1 IDDLPTGTEATLSSAVSASGSTSSQGDQSNPAQSPFSPHASPHLSSIPGGPSPSPVGSPV
::::: :::..:: .::.:: .::::.::::::::::::.:::: .: : :::::::::.
CCDS44 IDDLPMGTEGALSPGVSTSGISSSQGEQSNPAQSPFSPHTSPHLPGIRG-PSPSPVGSPA
650 660 670 680 690 700
680 690 700 710 720 730
pF1KE1 GSNQSRSGPISPASIPGSQMPPQPPGSQSESSSHPALSQSPMPQERGFMAGTQRNPQMAQ
. ::::::.:::..::.::::.::..::.: ::...:: . :.::.: :::::: :
CCDS44 SVAQSRSGPLSPAAVPGNQMPPRPPSGQSDSIMHPSMNQSSIAQDRGYM---QRNPQMPQ
710 720 730 740 750 760
740 750 760 770 780 790
pF1KE1 YGPQQTGPSMSPHPSPGGQMHAGISSFQQSNSSGTYGPQMSQYGPQGNYSRPPAYSGVPS
:. : : ..::. :::.:.:..:.:: :: :.:::: .::::::.: : : :...:.
CCDS44 YSSPQPGSALSPRQPSGGQIHTGMGSYQQ-NSMGSYGPQGGQYGPQGGYPRQPNYNALPN
770 780 790 800 810 820
800 810 820 830 840 850
pF1KE1 ASYSGPGPGMGIS---ANNQMHGQGPSQPCGAVPLGRMPSAGMQNRPFPGNMSSMTPSSP
:.: . : . ::. :..::::: : :..: ::: :.: :::. ::..: :
CCDS44 ANYPSAGMAGGINPMGAGGQMHGQPGIPPYGTLPPGRMSHASMGNRPYGPNMANMPP---
830 840 850 860 870
860 870 880 890 900 910
pF1KE1 GMSQQGGPGMGPPMPTVNRKAQEAAAAVMQAAANSAQSRQGSFPGMNQSGLMASSSPYSQ
: : :: :: .:::.::.:.: :..:::: :.: ..:.:::.:.:... ::.:
CCDS44 ----QVGSGMCPPPGGMNRKTQETAVA-MHVAANSIQNRPPGYPNMNQGGMMGTGPPYGQ
880 890 900 910 920 930
920 930 940 950 960 970
pF1KE1 PMNNSSSLMNTQAPPYSMAPAMVNSSAASVGLADMMSPGESKLPLPLKADGKEEGTPQPE
.:. ....: :.:::::. .:.:.::. .. .::. :. :: : ..: .:::. :
CCDS44 GINSMAGMINPQGPPYSMGGTMANNSAGMAASPEMMGLGDVKLTPATKMNNKADGTPKTE
940 950 960 970 980 990
980 990 1000 1010 1020 1030
pF1KE1 SKSKDSYSSQGISQPPTPGNLPVPSPMSPSSASISSFHGDESDSISSPGWPKTPSSPKSS
:::: :::
CCDS44 SKSK-----------------------------------------------------KSS
1040 1050 1060 1070 1080 1090
pF1KE1 SSTTTGEKITKVYELGNEPERKLWVDRYLTFMEERGSPVSSLPAVGKKPLDLFRLYVCVK
:::::.:::::.::::.:::::.::::::.: ::.. ...:::::.:::::.:::: ::
CCDS44 SSTTTNEKITKLYELGGEPERKMWVDRYLAFTEEKAMGMTNLPAVGRKPLDLYRLYVSVK
1000 1010 1020 1030 1040 1050
1100 1110 1120 1130 1140 1150
pF1KE1 EIGGLAQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQYLFAFECKIERGEEPPPEVFS
:::::.::::::::::::::::::::::::::::::::: :.::::::::::.:::..:.
CCDS44 EIGGLTQVNKNKKWRELATNLNVGTSSSAASSLKKQYIQCLYAFECKIERGEDPPPDIFA
1060 1070 1080 1090 1100 1110
1160 1170 1180 1190 1200
pF1KE1 TGDTKK-QPKLQPPSPANSGSLQGPQTPQSTGSNSMAEVPGDLKPPTPASTPHGQMTPMQ
..:.:: :::.::::::.:::.::::::::: :.:::: :::::::::::::.:. :.
CCDS44 AADSKKSQPKIQPPSPAGSGSMQGPQTPQST-SSSMAE-GGDLKPPTPASTPHSQIPPLP
1120 1130 1140 1150 1160 1170
1210 1220 1230 1240 1250 1260
pF1KE1 G-GRSSTISVHDPFSDVSDSSFPKRNSMTPNAPYQQGMSMPDVMGRMPYEPNKDPFGGMR
: .::......: :.: :::.: :::::::: :: .:. :.:::: :::::::.:.::
CCDS44 GMSRSNSVGIQDAFNDGSDSTFQKRNSMTPNPGYQPSMNTSDMMGRMSYEPNKDPYGSMR
1180 1190 1200 1210 1220 1230
1270 1280 1290 1300 1310 1320
pF1KE1 KVPGSSEPFMTQGQMPNSSMQDMYNQSPSGAMSNLGMGQRQQFPYGASYDR-RHEPYGQQ
:.::: .:::..:: ::..: : :... . ...:..:: ::..:::. ::: : :
CCDS44 KAPGS-DPFMSSGQGPNGGMGDPYSRAAGPGLGNVAMGPRQHYPYGGPYDRVRTE-----
1240 1250 1260 1270 1280 1290
1330 1340 1350 1360 1370 1380
pF1KE1 YPGQGPPSGQPPYGGHQPGLYPQQPNYKRHMDGMYGPPAKRHEGDMYNMQYSSQQQEMYN
:: :: :. :. ::.:.:..:. .:::.: . : : ..:::. ..
CCDS44 -PGIGPE-GNMSTGAPQPNLMPSNPD-----SGMYSP-------SRYPPQQQQQQQQRHD
1300 1310 1320 1330
1390 1400 1410 1420 1430 1440
pF1KE1 QYGGSYSGPDRRPIQGQYPYPYSRERMQGPGQIQTHGIPPQMMGGPLQSSSSEGPQQNMW
.::...: :.: : :.:
CCDS44 SYGNQFS---------------------------TQGTP----------SGS--------
1340 1350
1450 1460 1470 1480 1490 1500
pF1KE1 AARNDMPYPYQNRQGPGGPTQAPPYPGMNRTDDMMVPDQRINHESQWPSHVSQRQPYMSS
:.: : :. : :.: .::
CCDS44 ------PFPSQ---------QTTMY---------------------------QQQQQVSS
1360 1370
1510 1520 1530 1540 1550 1560
pF1KE1 SASMQPITRPPQPSYQTPPSLPNHISRAPSPASFQRSLENRMSPSKSPFLPS-MKMQKVM
: :. :: .::: :::::::: : :::::.
CCDS44 PA---PLPRP---------------------------MENRTSPSKSPFLHSGMKMQKAG
1380 1390 1400
1570 1580 1590 1600 1610 1620
pF1KE1 PTVPTSQVTGPPPQPPPIRREITFPPGSVEASQPVLKQRRKITSKDIVTPEAWRVMMSLK
: ::.:... : ::: :::.::::::::::.::::::::..: ::: ::::::::::::
CCDS44 PPVPASHIAPAPVQPPMIRRDITFPPGSVEATQPVLKQRRRLTMKDIGTPEAWRVMMSLK
1410 1420 1430 1440 1450 1460
1630 1640 1650 1660 1670 1680
pF1KE1 SGLLAESTWALDTINILLYDDSTVATFNLSQLSGFLELLVEYFRKCLIDIFGILMEYEVG
:::::::::::::::::::::... ::::::: :.:::::::::.:::.::::: :::::
CCDS44 SGLLAESTWALDTINILLYDDNSIMTFNLSQLPGLLELLVEYFRRCLIEIFGILKEYEVG
1470 1480 1490 1500 1510 1520
1690 1700 1710 1720 1730 1740
pF1KE1 DPSQKALDHNAARKDDSQSLADDSGKEEEDAECIDDDEEDEEDEEEDSEKTESDEKSSIA
::.:..: . .: . .: : : :::. : . :.::.:: .:.::. ::
CCDS44 DPGQRTL-LDPGRFSKVSSPAPMEGGEEEE-ELLGPKLEEEEEEEV----VENDEE--IA
1530 1540 1550 1560 1570
1750 1760 1770 1780 1790 1800
pF1KE1 LTAPDAAADPKEKPKQASKFDKLPIKIVKKNNLFVVDRSDKLGRVQEFNSGLLHWQLGGG
... : :. . . : :::::::.:::.::. :::: ::::::::::.::::::..:::
CCDS44 FSGKDKPASENSEEKLISKFDKLPVKIVQKNDPFVVDCSDKLGRVQEFDSGLLHWRIGGG
1580 1590 1600 1610 1620 1630
1810 1820 1830 1840 1850
pF1KE1 DTTEHIQTHFESKMEIPPRRRPPPPLSSAGRKK--EQEGKGDSEEQQ--------EKSII
::::::::::::: :. : : : : : ::. :: . .:. :: :
CCDS44 DTTEHIQTHFESKTELLPSR-PHAPCPPAPRKHVTTAEGTPGTTDQEGPPPDGPPEKRIT
1640 1650 1660 1670 1680 1690
1860 1870 1880 1890 1900 1910
pF1KE1 ATIDDVLSARPGALPEDANPGPQT--ESSKFPFGIQQAKSHRNIKLLEDEPRSRDETPLC
::.::.::.: ..: ::. . .. :::::::::. :.::::::.:::::.:.::::::
CCDS44 ATMDDMLSTRSSTLTEDGAKSSEAIKESSKFPFGISPAQSHRNIKILEDEPHSKDETPLC
1700 1710 1720 1730 1740 1750
1920 1930 1940 1950 1960 1970
pF1KE1 TIAHWQDSLAKRCICVSNIVRSLSFVPGNDAEMSKHPGLVLILGKLILLHHEHPERKRAP
:. :::::::::.:::: .:::::::::: ::::::::.:::::::::::.:::::.::
CCDS44 TLLDWQDSLAKRCVCVSNTIRSLSFVPGNDFEMSKHPGLLLILGKLILLHHKHPERKQAP
1760 1770 1780 1790 1800 1810
1980 1990 2000 2010 2020 2030
pF1KE1 QTYEKEEDEDKGVACSKDEWWWDCLEVLRDNTLVTLANISGQLDLSAYTESICLPILDGL
::::::..:.::.:.: ::::::::.::.:::::::::::::::: : ::::::.::::
CCDS44 LTYEKEEEQDQGVSCNKVEWWWDCLEMLRENTLVTLANISGQLDLSPYPESICLPVLDGL
1820 1830 1840 1850 1860 1870
2040 2050 2060 2070 2080 2090
pF1KE1 LHWMVCPSAEAQDPFPTVGPNSVLSPQRLVLETLCKLSIQDNNVDLILATPPFSRQEKFY
::: ::::::::::: :.:::.:::::::::::: :::::::::::::::::::: ::.:
CCDS44 LHWAVCPSAEAQDPFSTLGPNAVLSPQRLVLETLSKLSIQDNNVDLILATPPFSRLEKLY
1880 1890 1900 1910 1920 1930
2100 2110 2120 2130 2140 2150
pF1KE1 ATLVRYVGDRKNPVCREMSMALLSNLAQGDALAARAIAVQKGSIGNLISFLEDGVTMAQY
.:.::...::::::::::...::.::::::.::::::::::::::::..::::... .:.
CCDS44 STMVRFLSDRKNPVCREMAVVLLANLAQGDSLAARAIAVQKGSIGNLLGFLEDSLAATQF
1940 1950 1960 1970 1980 1990
2160 2170 2180 2190 2200 2210
pF1KE1 QQSQHNLMHMQPPPLEPPSVDMMCRAAKALLAMARVDENRSEFLLHEGRLLDISISAVLN
:::: .:.::: ::.:: ::::: :::.::::.:.::::.::: :.:.::::::.: ..:
CCDS44 QQSQASLLHMQNPPFEPTSVDMMRRAARALLALAKVDENHSEFTLYESRLLDISVSPLMN
2000 2010 2020 2030 2040 2050
2220 2230
pF1KE1 SLVASVICDVLFQIGQL
:::..::::::: :::
CCDS44 SLVSQVICDVLFLIGQS
2060
2231 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Sun Nov 6 20:00:58 2016 done: Sun Nov 6 20:00:59 2016
Total Scan time: 5.660 Total Display time: 0.440
Function used was FASTA [36.3.4 Apr, 2011]