FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE3424, 504 aa
  1>>>pF1KE3424 504 - 504 aa - 504 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 9.3561+/-0.00097; mu= 2.8516+/- 0.059
 mean_var=322.3813+/-67.355, 0's: 0 Z-trim(115.5): 26 B-trim: 91 in 1/50
 Lambda= 0.071431
 statistics sampled from 16065 (16084) to 16065 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.782), E-opt: 0.2 (0.494), width: 16
 Scan time: 4.040

The best scores are:                                      opt  bits E(32554)
CCDS42440.1 ONECUT2 gene_id:9480|Hs108|chr18      ( 504) 3466 370.7 2.1e-102
CCDS10150.1 ONECUT1 gene_id:3175|Hs108|chr15      ( 465) 1849 204.0 2.9e-52
CCDS45900.1 ONECUT3 gene_id:390874|Hs108|chr19    ( 494) 1038 120.5 4.4e-27

>>CCDS42440.1 ONECUT2 gene_id:9480|Hs108|chr18 (504 aa)
 initn: 3466 init1: 3466 opt: 3466 Z-score: 1951.9 bits: 370.7 E(32554): 2.1e-102
Smith-Waterman score: 3466; 100.0% identity (100.0% similar) in 504 aa overlap (1-504:1-504)

               10        20        30        40        50        60
pF1KE3 MKAAYTAYRCLTKDLEGCAMNPELTMESLGTLHGPAGGGSGGGGGGGGGGGGGGPGHEQE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS42 MKAAYTAYRCLTKDLEGCAMNPELTMESLGTLHGPAGGGSGGGGGGGGGGGGGGPGHEQE
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE3 LLASPSPHHAGRGAAGSLRGPPPPPTAHQELGTAAAAAAAASRSAMVTSMASILDGGDYR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS42 LLASPSPHHAGRGAAGSLRGPPPPPTAHQELGTAAAAAAAASRSAMVTSMASILDGGDYR
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE3 PELSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQPLPPISTVSDKFHHPHPHHHPHHHH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS42 PELSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQPLPPISTVSDKFHHPHPHHHPHHHH
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE3 HHHHQRLSGNVSGSFTLMRDERGLPAMNNLYSPYKEMPGMSQSLSPLAATPLGNGLGGLH
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS42 HHHHQRLSGNVSGSFTLMRDERGLPAMNNLYSPYKEMPGMSQSLSPLAATPLGNGLGGLH 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE3 NAQQSLPNYGPPGHDKMLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAMMSHLNGLHHPGH :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS42 NAQQSLPNYGPPGHDKMLSPNFDAHHTAMLTRGEQHLSRGLGTPPAAMMSHLNGLHHPGH 250 260 270 280 290 300 310 320 330 340 350 360 pF1KE3 TQSHGPVLAPSRERPPSSSSGSQVATSGQLEEINTKEVAQRITAELKRYSIPQAIFAQRV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS42 TQSHGPVLAPSRERPPSSSSGSQVATSGQLEEINTKEVAQRITAELKRYSIPQAIFAQRV 310 320 330 340 350 360 370 380 390 400 410 420 pF1KE3 LCRSQGTLSDLLRNPKPWSKLKSGRETFRRMWKWLQEPEFQRMSALRLAACKRKEQEPNK :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS42 LCRSQGTLSDLLRNPKPWSKLKSGRETFRRMWKWLQEPEFQRMSALRLAACKRKEQEPNK 370 380 390 400 410 420 430 440 450 460 470 480 pF1KE3 DRNNSQKKSRLVFTDLQRRTLFAIFKENKRPSKEMQITISQQLGLELTTVSNFFMNARRR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS42 DRNNSQKKSRLVFTDLQRRTLFAIFKENKRPSKEMQITISQQLGLELTTVSNFFMNARRR 430 440 450 460 470 480 490 500 pF1KE3 SLEKWQDDLSTGGSSSTSSTCTKA :::::::::::::::::::::::: CCDS42 SLEKWQDDLSTGGSSSTSSTCTKA 490 500 >>CCDS10150.1 ONECUT1 gene_id:3175|Hs108|chr15 (465 aa) initn: 1348 init1: 1017 opt: 1849 Z-score: 1051.7 bits: 204.0 E(32554): 2.9e-52 Smith-Waterman score: 1890; 63.4% identity (76.3% similar) in 514 aa overlap (20-504:1-465) 10 20 30 40 50 60 pF1KE3 MKAAYTAYRCLTKDLEGCAMNPELTMESLGTLHGPAGGGSGGGGGGGGGGGGGGPGHEQE :: .::::..: ::: . . . :: CCDS10 MNAQLTMEAIGELHGVSHEPVPAPADLLGG----------- 10 20 30 70 80 90 100 110 pF1KE3 LLASPSPHHAGRGAAGSLRGPPPPPTAHQELGTAAAAAAAASRSAMVTSMASILDGG--- ::: .:.... :: ::. . .: :::.:::: CCDS10 -----SPH--ARSSVAH-RGSHLPPAHPRSMG-----------------MASLLDGGSGG 40 50 60 120 130 140 150 160 pF1KE3 -DYR-----PELSI--PLHHAMSMSCDSSPPGMGMSNTYTTLTPLQPLPPISTVSDKFHH ::. :: :. ::: .:.:.:.. 
::::.: .::::::::::::::::::::: CCDS10 GDYHHHHRAPEHSLAGPLHPTMTMACET-PPGMSMPTTYTTLTPLQPLPPISTVSDKF-- 70 80 90 100 110 120 170 180 190 200 210 220 pF1KE3 PHPHHHPHHHHH-HHHQRLSGNVSGSFTLMRDERGLPAMNNLYSPY-KEMPGMSQSLSPL :: ::: ::::: ::::::.:::::::::::::::: .:::::.:: :.. ::.:::::: CCDS10 PHHHHHHHHHHHPHHHQRLAGNVSGSFTLMRDERGLASMNNLYTPYHKDVAGMGQSLSPL 130 140 150 160 170 180 230 240 250 260 270 280 pF1KE3 AATPLGNGLGGLHNAQQSLPNYGPPGH----DKMLSPN-FDAHHTAMLTR-GEQHLSRGL ... :::..::.::.::.:. :: ::::.:: :.::: ::: : ::::: CCDS10 SSS----GLGSIHNSQQGLPHYAHPGAAMPTDKMLTPNGFEAHHPAMLGRHGEQHL---- 190 200 210 220 230 290 300 310 320 330 pF1KE3 GTPPAAMMSHLNGL--HHP-GH--TQSHGPVLAPSRERPPSSSSGSQVAT---SGQLEEI :: .: : .::: ::: .: .:.:: .:. .:: : : .:.::.. :::.::: CCDS10 -TPTSAGMVPINGLPPHHPHAHLNAQGHGQLLGTARE-PNPSVTGAQVSNGSNSGQMEEI 240 250 260 270 280 290 340 350 360 370 380 390 pF1KE3 NTKEVAQRITAELKRYSIPQAIFAQRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRRMWK ::::::::::.::::::::::::::::::::::::::::::::::::::::::::::::: CCDS10 NTKEVAQRITTELKRYSIPQAIFAQRVLCRSQGTLSDLLRNPKPWSKLKSGRETFRRMWK 300 310 320 330 340 350 400 410 420 430 440 450 pF1KE3 WLQEPEFQRMSALRLAACKRKEQEPNKDRNNSQKKSRLVFTDLQRRTLFAIFKENKRPSK :::::::::::::::::::::::: .:::.:. :: ::::::.::::: ::::::::::: CCDS10 WLQEPEFQRMSALRLAACKRKEQEHGKDRGNTPKKPRLVFTDVQRRTLHAIFKENKRPSK 360 370 380 390 400 410 460 470 480 490 500 pF1KE3 EMQITISQQLGLELTTVSNFFMNARRRSLEKWQDDLST--GGSSSTSSTCTKA :.::::::::::::.::::::::::::::.::::. :. :.:::.::::::: CCDS10 ELQITISQQLGLELSTVSNFFMNARRRSLDKWQDEGSSNSGNSSSSSSTCTKA 420 430 440 450 460 >>CCDS45900.1 ONECUT3 gene_id:390874|Hs108|chr19 (494 aa) initn: 1167 init1: 949 opt: 1038 Z-score: 599.7 bits: 120.5 E(32554): 4.4e-27 Smith-Waterman score: 1452; 52.5% identity (69.5% similar) in 531 aa overlap (23-504:2-494) 10 20 30 40 50 60 pF1KE3 MKAAYTAYRCLTKDLEGCAMNPELTMESLGTLHGPAGGGSGGGGGGGGGGGGGGPGHEQE ::..:::: ::. : . 
.: : CCDS45 MELSLESLGGLHSVAHAQAG------------------E 10 20 70 80 90 100 110 pF1KE3 LLASPSPHHAGRGAAGSLRGPPPPPTAHQELGTAA----AAAAAASRSAMVTSMASILDG :: :: :: :.::.. :: : : :. ....... .. . . .: : CCDS45 LL---SPGHA-RSAAAQHRGLVAPGRPGLVAGMASLLDGGGGGGGGGAGGAGGAGSAGGG 30 40 50 60 70 120 130 140 150 160 170 pF1KE3 GDYRPELSIPLHHAMSMSCDSSPPGMGMSNTYTTLTPLQPLPPISTVSDKFHHP------ .:.: ::. ::: ::.:.:.. ::.: .::::::::: :::...:.::::. CCDS45 ADFRGELAGPLHPAMGMACEA--PGLG--GTYTTLTPLQHLPPLAAVADKFHQHAAAAAV 80 90 100 110 120 130 180 190 200 210 pF1KE3 ------HPHHHPHHHHHHHH----QRLSGNVSGSFTLMRDERG-LPAMNNLYSPY-KEMP ::: ::: :::...::::::::::::. : ....::.:: ::.: CCDS45 AGAHGGHPHAHPHPAAAPPPPPPPQRLAASVSGSFTLMRDERAALASVGHLYGPYGKELP 140 150 160 170 180 190 220 230 240 250 260 pF1KE3 GMSQSLSPLAATPLGNGLG-GLHNAQQSLP--------NYGPPGH---DKMLSPNFDAHH .:. ::: .:: :.: .::.: : : :::::: ::.: : : CCDS45 AMG---SPL--SPLPNALPPALHGAPQPPPPPPPPPLAAYGPPGHLAGDKLLPPAAFEPH 200 210 220 230 240 270 280 290 300 310 pF1KE3 TAMLTRGEQHLSRGL------------GTPPAA-MMSHLNGLHHPGHTQSHGPVLAPSRE .:.: :.:. :.::: :. :: ... :.:: : .::: . . CCDS45 AALLGRAEDALARGLPGGGGGTGSGGAGSGSAAGLLAPLGGLAAAG---AHGPHGGGG-- 250 260 270 280 290 300 320 330 340 350 360 370 pF1KE3 RPPSSSSGSQVATSGQLEEINTKEVAQRITAELKRYSIPQAIFAQRVLCRSQGTLSDLLR :..:.:. : .. :::::::::::::::::::::::::::::.::::::::::::: CCDS45 -GPGGSGGGPSAGAAA-EEINTKEVAQRITAELKRYSIPQAIFAQRILCRSQGTLSDLLR 310 320 330 340 350 360 380 390 400 410 420 430 pF1KE3 NPKPWSKLKSGRETFRRMWKWLQEPEFQRMSALRLAACKRKEQEPNKDRNNSQKKSRLVF :::::::::::::::::::::::::::::::::::::::::::: .:.: . ::.:::: CCDS45 NPKPWSKLKSGRETFRRMWKWLQEPEFQRMSALRLAACKRKEQEQQKERALQPKKQRLVF 370 380 390 400 410 420 440 450 460 470 480 490 pF1KE3 TDLQRRTLFAIFKENKRPSKEMQITISQQLGLELTTVSNFFMNARRRSLEKWQDDLST-- ::::::::.::::::::::::::.::::::::::.:::::::::::: ...: .. 
::
CCDS45 TDLQRRTLIAIFKENKRPSKEMQVTISQQLGLELNTVSNFFMNARRRCMNRWAEEPSTAP
             430       440       450       460       470       480

             500
pF1KE3 GGSSSTSSTCTKA
       :: .....: .::
CCDS45 GGPAGATATFSKA
             490

504 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Tue Nov 8 04:32:37 2016 done: Tue Nov 8 04:32:38 2016
 Total Scan time: 4.040 Total Display time: 0.000

Function used was FASTA [36.3.4 Apr, 2011]
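
The per-hit records in this report (each `>>` line followed by initn, init1, opt, Z-score, bits, E() and the Smith-Waterman summary) follow a regular layout, so they can be pulled into structured records. The following is a minimal Python sketch, not part of FASTA itself: `HIT_RE`, `parse_hits`, and `SAMPLE` are hypothetical names chosen for illustration, and the pattern is written against this report only, so other FASTA versions or output formats may need adjustments. It tolerates arbitrary whitespace between fields, so it should work on both the line-broken and the flattened form of the text.

```python
import re

# Pattern for one hit record. Assumption: the field order seen in this
# report (">>" header, initn/init1/opt, Z-score, bits, E(), then the
# Smith-Waterman summary) is what the input contains.
HIT_RE = re.compile(
    r">>(?P<acc>\S+)\s+(?P<gene>\S+)\s+gene_id:(?P<gene_id>\d+)\S*\s+"
    r"\((?P<length>\d+) aa\)\s+"
    r"initn:\s*(?P<initn>\d+)\s+init1:\s*(?P<init1>\d+)\s+"
    r"opt:\s*(?P<opt>\d+)\s+"
    r"Z-score:\s*(?P<zscore>[\d.]+)\s+bits:\s*(?P<bits>[\d.]+)\s+"
    r"E\(\d+\):\s*(?P<evalue>\S+)\s+"
    r"Smith-Waterman score:\s*(?P<sw>\d+);\s*"
    r"(?P<identity>[\d.]+)% identity\s+\((?P<similar>[\d.]+)% similar\)\s+"
    r"in\s+(?P<overlap>\d+) aa overlap\s+"
    r"\((?P<q_start>\d+)-(?P<q_end>\d+):(?P<s_start>\d+)-(?P<s_end>\d+)\)"
)

INT_FIELDS = ("gene_id", "length", "initn", "init1", "opt", "sw",
              "overlap", "q_start", "q_end", "s_start", "s_end")
FLOAT_FIELDS = ("zscore", "bits", "evalue", "identity", "similar")


def parse_hits(text):
    """Return one dict per hit record found in a FASTA report string."""
    hits = []
    for match in HIT_RE.finditer(text):
        rec = match.groupdict()
        for key in INT_FIELDS:
            rec[key] = int(rec[key])
        for key in FLOAT_FIELDS:
            rec[key] = float(rec[key])
        hits.append(rec)
    return hits


# Two hit headers copied from the report above, in flattened form.
SAMPLE = (
    ">>CCDS10150.1 ONECUT1 gene_id:3175|Hs108|chr15 (465 aa) "
    "initn: 1348 init1: 1017 opt: 1849 Z-score: 1051.7 bits: 204.0 "
    "E(32554): 2.9e-52 Smith-Waterman score: 1890; 63.4% identity "
    "(76.3% similar) in 514 aa overlap (20-504:1-465) "
    ">>CCDS45900.1 ONECUT3 gene_id:390874|Hs108|chr19 (494 aa) "
    "initn: 1167 init1: 949 opt: 1038 Z-score: 599.7 bits: 120.5 "
    "E(32554): 4.4e-27 Smith-Waterman score: 1452; 52.5% identity "
    "(69.5% similar) in 531 aa overlap (23-504:2-494)"
)

if __name__ == "__main__":
    for hit in parse_hits(SAMPLE):
        print(hit["acc"], hit["gene"], hit["opt"], hit["bits"], hit["evalue"])
```

The query banner (`1>>>pF1KE3424 ...`) also starts with `>>`, but it lacks the `gene_id:` field, so the pattern skips it. Keeping the numeric conversion in one place makes it easy to filter hits afterwards, for example by `evalue` or `identity`.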