FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011 Please cite: W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448 Query: pF1KE5739, 609 aa 1>>>pF1KE5739 609 - 609 aa - 609 aa Library: human.CCDS.faa 18511270 residues in 32554 sequences Statistics: Expectation_n fit: rho(ln(x))= 8.1393+/-0.000806; mu= 7.5653+/- 0.049 mean_var=141.8112+/-29.215, 0's: 0 Z-trim(112.7): 37 B-trim: 159 in 1/54 Lambda= 0.107701 statistics sampled from 13415 (13449) to 13415 sequences Algorithm: FASTA (3.7 Nov 2010) [optimized] Parameters: BL50 matrix (15:-5), open/ext: -10/-2 ktup: 2, E-join: 1 (0.728), E-opt: 0.2 (0.413), width: 16 Scan time: 2.420 The best scores are: opt bits E(32554) CCDS34723.1 KMT2E gene_id:55904|Hs108|chr7 (1858) 3881 614.6 3.7e-175 CCDS46741.1 SETD5 gene_id:55209|Hs108|chr3 (1442) 977 163.3 2e-39 CCDS74892.1 SETD5 gene_id:55209|Hs108|chr3 (1344) 780 132.7 3.1e-30 >>CCDS34723.1 KMT2E gene_id:55904|Hs108|chr7 (1858 aa) initn: 3881 init1: 3881 opt: 3881 Z-score: 3258.1 bits: 614.6 E(32554): 3.7e-175 Smith-Waterman score: 3881; 99.5% identity (99.7% similar) in 578 aa overlap (1-578:1-578) 10 20 30 40 50 60 pF1KE5 MSIVIPLGVDTAETSYLEMAAGSEPESVEASPVVVEKSNSYPHQLYTSSSHHSHSYIGLP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 MSIVIPLGVDTAETSYLEMAAGSEPESVEASPVVVEKSNSYPHQLYTSSSHHSHSYIGLP 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE5 YADHNYGARPPPTPPASPPPSVLISKNEVGIFTTPNFDETSSATTISTSEDGSYGTDVTR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 YADHNYGARPPPTPPASPPPSVLISKNEVGIFTTPNFDETSSATTISTSEDGSYGTDVTR 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE5 CICGFTHDDGYMICCDKCSVWQHIDCMGIDRQHIPDTYLCERCQPRNLDKERAVLLQRRK :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 CICGFTHDDGYMICCDKCSVWQHIDCMGIDRQHIPDTYLCERCQPRNLDKERAVLLQRRK 130 140 150 160 170 180 190 200 210 220 230 240 pF1KE5 RENMSDGDTSATESGDEVPVELYTAFQHTPTSITLTASRVSKVNDKRRKKSGEKEQHISK :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 RENMSDGDTSATESGDEVPVELYTAFQHTPTSITLTASRVSKVNDKRRKKSGEKEQHISK 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE5 CKKAFREGSRKSSRVKGSAPEIDPSSDGSNFGWETKIKAWMDRYEEANNNQYSEGVQREA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 CKKAFREGSRKSSRVKGSAPEIDPSSDGSNFGWETKIKAWMDRYEEANNNQYSEGVQREA 250 260 270 280 290 300 310 320 330 340 350 360 pF1KE5 QRIALRLGNGNDKKEMNKSDLNTNNLLFKPPVESHIQKNKKILKSAKDLPPDALIIEYRG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 QRIALRLGNGNDKKEMNKSDLNTNNLLFKPPVESHIQKNKKILKSAKDLPPDALIIEYRG 310 320 330 340 350 360 370 380 390 400 410 420 pF1KE5 KFMLREQFEANGYFFKRPYPFVLFYSKFHGLEMCVDARTFGNEARFIRRSCTPNAEVRHE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 KFMLREQFEANGYFFKRPYPFVLFYSKFHGLEMCVDARTFGNEARFIRRSCTPNAEVRHE 370 380 390 400 410 420 430 440 450 460 470 480 pF1KE5 IQDGTIHLYIYSIHSIPKGTEITIAFDFDYGNCKYKVDCACLKENPECPVLKRSSESMEN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 IQDGTIHLYIYSIHSIPKGTEITIAFDFDYGNCKYKVDCACLKENPECPVLKRSSESMEN 430 440 450 460 470 480 490 500 510 520 530 540 pF1KE5 INSGYETRRKKGKKDKDISKEKDTQNQNITLDCEGTTNKMKSPETKQRKLSPLRLSVSNN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 INSGYETRRKKGKKDKDISKEKDTQNQNITLDCEGTTNKMKSPETKQRKLSPLRLSVSNN 490 500 510 520 530 540 550 560 570 580 590 600 pF1KE5 QEPDFIDDIEEKTPISNEVEMESEEQIAERKRKMVSWEASSLGLVTAALHMVIVAAFTWA ::::::::::::::::::::::::::::::::::. : CCDS34 QEPDFIDDIEEKTPISNEVEMESEEQIAERKRKMTREERKMEAILQAFARLEKREKRREQ 550 560 570 580 590 600 pF1KE5 FTLFFEVSE CCDS34 ALERISTAKTEVKTECKDTQIVSDAEVIQEQAKEENASKPTPAKVNRTKQRKSFSRSRTH 610 620 630 640 650 660 >>CCDS46741.1 SETD5 gene_id:55209|Hs108|chr3 (1442 aa) initn: 1183 init1: 726 opt: 977 Z-score: 821.2 bits: 163.3 E(32554): 2e-39 Smith-Waterman score: 1270; 41.4% identity (63.8% similar) in 594 aa overlap (1-572:1-508) 10 20 30 40 50 60 pF1KE5 MSIVIPLGVDTAETSYLEMAAGSEPESVEASPVVVEKSNSYPHQLYTSSSHHSHSYIGLP :::.::::: :..::: .:::::.::::::::.: ::: :. :.. : . ::: CCDS46 MSIAIPLGVTTSDTSYSDMAAGSDPESVEASPAVNEKSVYSTHNYGTTQRHGCR---GLP 10 20 30 40 50 70 80 90 100 110 120 pF1KE5 YADHNYGARPPPTPPASPPPSVLISKNEVGIFTTPNFDETSSATTISTSEDGSYGTDVTR :: ..: ..... . .: .: . . : .: : : CCDS46 YA-------------------TIIPRSDLNGLPSP-VEERCGDSPNSEGE-----TVPTW 60 70 80 90 130 140 150 160 170 180 pF1KE5 CICGFTHDDGYMICCDKCSVWQHIDCMGIDRQHIPDTYLCERCQPRNLDKERAVLLQRRK : ::...: :... :::: :.... ... :.::: CCDS46 CPCGLSQD-GFLLNCDKC---------------------------RGMSRGKVIRLHRRK 100 110 120 190 200 210 220 230 pF1KE5 RENMSDGDTSATESGDE--VPVE-LYTAFQHTPTSITLTASRVSKVNDKRRKKSGEKEQH ..:.: ::.::::: :: : :::: :::::::::: : ... :.:::: :: . CCDS46 QDNISGGDSSATESWDEELSPSTVLYTATQHTPTSITLT---VRRTKPKKRKKSPEKGRA 130 140 150 160 170 180 240 250 260 270 280 290 pF1KE5 ISKCKKAFREGSRKSSRVKGSAPEIDPSSDGSNFGWETKIKAWMDRYEEANNNQYSEGVQ : :: .:.: : . ..... :::..:. : :.:::: .:::: :: CCDS46 APKTKK-----------IKNSPSEAQNLDENTTEGWENRIRLWTDQYEEAFTNQYSADVQ 190 200 210 220 230 300 310 320 330 340 pF1KE5 RE-AQRIALR---LGNGNDKKEMNKSDLNTNNLLFKPPVE------SHIQKNKKILKSAK :.. .:. . .::..: :: .. .. ...::..:::..:. CCDS46 NALEQHLHSSKEFVGKPTILDTINKTELACNNTVIGSQMQLQLGRVTRVQKHRKILRAAR 240 250 260 270 280 290 350 360 370 380 390 400 pF1KE5 DLPPDALIIEYRGKFMLREQFEANGYFFKRPYPFVLFYSKFHGLEMCVDARTFGNEARFI :: :.:::::::: :::.:::.::.:::.:::::::::::.:.:::::::::::.:::: CCDS46 DLALDTLIIEYRGKVMLRQQFEVNGHFFKKPYPFVLFYSKFNGVEMCVDARTFGNDARFI 300 310 320 330 340 350 410 420 430 440 450 460 pF1KE5 RRSCTPNAEVRHEIQDGTIHLYIYSIHSIPKGTEITIAFDFDYGNCKYKVDCACLKENPE :::::::::::: : :: ::: ::.. .: : .:.:::::..:.::.::::::: : : . CCDS46 RRSCTPNAEVRHMIADGMIHLCIYAVSAITKDAEVTIAFDYEYSNCNYKVDCACHKGNRN 360 370 380 390 400 410 470 480 490 500 510 pF1KE5 CPVLKRSSESME--------NINS-GYETRRKKGKKDKDISKEKDTQNQNITLDCEGTTN ::. ::. .. : .. . : ::::.:... :: . ..:: . .: CCDS46 CPIQKRNPNATELPLLPPPPSLPTIGAETRRRKARR-----KELEMEQQN---EASEENN 420 430 440 450 460 520 530 540 550 560 570 pF1KE5 KMKSPETKQRKLSPLRLSVSNNQEPDFIDDIEEKTPISNEVEMESEEQIAERKRKMVSWE ..: :. : ...::...: .:. ::: .: ....:..:. .: CCDS46 DQQSQEV------PEKVTVSSDHEE--VDNPEEKPEEEKEEVIDDQENLAHSRRTREDRK 470 480 490 500 510 580 590 600 pF1KE5 ASSLGLVTAALHMVIVAAFTWAFTLFFEVSE CCDS46 VEAIMHAFENLEKRKKRRDQPLEQSNSDVEITTTTSETPVGEETKTEAPESEVSNSVSNV 520 530 540 550 560 570 >>CCDS74892.1 SETD5 gene_id:55209|Hs108|chr3 (1344 aa) initn: 909 init1: 726 opt: 780 Z-score: 656.3 bits: 132.7 E(32554): 3.1e-30 Smith-Waterman score: 1157; 47.1% identity (71.1% similar) in 429 aa overlap (168-572:1-410) 140 150 160 170 180 190 pF1KE5 CSVWQHIDCMGIDRQHIPDTYLCERCQPRNLDKERAVLLQRRKRENMSDGDTSATESGDE ... ... :.:::..:.: ::.::::: :: CCDS74 MSRGKVIRLHRRKQDNISGGDSSATESWDE 10 20 30 200 210 220 230 240 250 pF1KE5 --VPVE-LYTAFQHTPTSITLTASRVSKVNDKRRKKSGEKEQHISKCKK--AFREGSRKS : :::: :::::::::: : ... :.:::: :: . : :: ::::::::: CCDS74 ELSPSTVLYTATQHTPTSITLT---VRRTKPKKRKKSPEKGRAAPKTKKIKAFREGSRKS 40 50 60 70 80 260 270 280 290 300 pF1KE5 SRVKGSAPEIDPSSDGSNFGWETKIKAWMDRYEEANNNQYSEGVQRE-AQRIALR---LG :.:.: : . ..... :::..:. : :.:::: .:::: :: :.. .: CCDS74 LRMKNSPSEAQNLDENTTEGWENRIRLWTDQYEEAFTNQYSADVQNALEQHLHSSKEFVG 90 100 110 120 130 140 310 320 330 340 350 360 pF1KE5 NGNDKKEMNKSDLNTNNLLFKPPVE------SHIQKNKKILKSAKDLPPDALIIEYRGKF . . .::..: :: .. .. ...::..:::..:.:: :.:::::::: CCDS74 KPTILDTINKTELACNNTVIGSQMQLQLGRVTRVQKHRKILRAARDLALDTLIIEYRGKV 150 160 170 180 190 200 370 380 390 400 410 420 pF1KE5 MLREQFEANGYFFKRPYPFVLFYSKFHGLEMCVDARTFGNEARFIRRSCTPNAEVRHEIQ :::.:::.::.:::.:::::::::::.:.:::::::::::.:::::::::::::::: : CCDS74 MLRQQFEVNGHFFKKPYPFVLFYSKFNGVEMCVDARTFGNDARFIRRSCTPNAEVRHMIA 210 220 230 240 250 260 430 440 450 460 470 pF1KE5 DGTIHLYIYSIHSIPKGTEITIAFDFDYGNCKYKVDCACLKENPECPVLKRSSESME--- :: ::: ::.. .: : .:.:::::..:.::.::::::: : : .::. ::. .. : CCDS74 DGMIHLCIYAVSAITKDAEVTIAFDYEYSNCNYKVDCACHKGNRNCPIQKRNPNATELPL 270 280 290 300 310 320 480 490 500 510 520 530 pF1KE5 -----NINS-GYETRRKKGKKDKDISKEKDTQNQNITLDCEGTTNKMKSPETKQRKLSPL .. . : ::::.:... :: . ..:: . .: ..: :. : CCDS74 LPPPPSLPTIGAETRRRKARR-----KELEMEQQN---EASEENNDQQSQEV------PE 330 340 350 360 370 540 550 560 570 580 590 pF1KE5 RLSVSNNQEPDFIDDIEEKTPISNEVEMESEEQIAERKRKMVSWEASSLGLVTAALHMVI ...::...: .:. ::: .: ....:..:. .: CCDS74 KVTVSSDHEE--VDNPEEKPEEEKEEVIDDQENLAHSRRTREDRKVEAIMHAFENLEKRK 380 390 400 410 420 430 600 pF1KE5 VAAFTWAFTLFFEVSE CCDS74 KRRDQPLEQSNSDVEITTTTSETPVGEETKTEAPESEVSNSVSNVTIPSTPQSVGVNTRR 440 450 460 470 480 490 609 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Tue Nov 8 06:15:51 2016 done: Tue Nov 8 06:15:52 2016 Total Scan time: 2.420 Total Display time: 0.000 Function used was FASTA [36.3.4 Apr, 2011]