FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE4418, 431 aa
  1>>>pF1KE4418 431 - 431 aa - 431 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 7.2909+/-0.00136; mu= 6.5102+/- 0.079
 mean_var=123.9561+/-25.602, 0's: 0 Z-trim(102.7): 152 B-trim: 5 in 1/49
 Lambda= 0.115197
 statistics sampled from 6891 (7058) to 6891 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.572), E-opt: 0.2 (0.217), width: 16
 Scan time: 2.720

The best scores are:                                      opt  bits E(32554)
CCDS13452.1 CSTF1 gene_id:1477|Hs108|chr20        ( 431) 2889 492.2  4e-139
CCDS3012.1 WDR5B gene_id:54554|Hs108|chr3         ( 330)  352  70.5 2.7e-12
CCDS6981.1 WDR5 gene_id:11091|Hs108|chr9          ( 334)  324  65.9 6.8e-11

>>CCDS13452.1 CSTF1 gene_id:1477|Hs108|chr20 (431 aa)
 initn: 2889 init1: 2889 opt: 2889 Z-score: 2611.2 bits: 492.2 E(32554): 4e-139
Smith-Waterman score: 2889; 100.0% identity (100.0% similar) in 431 aa overlap (1-431:1-431)

               10        20        30        40        50        60
pF1KE4 MYRTKVGLKDRQQLYKLIISQLLYDGYISIANGLINEIKPQSVCAPSEQLLHLIKLGMEN
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 MYRTKVGLKDRQQLYKLIISQLLYDGYISIANGLINEIKPQSVCAPSEQLLHLIKLGMEN
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE4 DDTAVQYAIGRSDTVAPGTGIDLEFDADVQTMSPEASEYETCYVTSHKGPCRVATYSRDG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 DDTAVQYAIGRSDTVAPGTGIDLEFDADVQTMSPEASEYETCYVTSHKGPCRVATYSRDG
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE4 QLIATGSADASIKILDTERMLAKSAMPIEVMMNETAQQNMENHPVIRTLYDHVDEVTCLA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 QLIATGSADASIKILDTERMLAKSAMPIEVMMNETAQQNMENHPVIRTLYDHVDEVTCLA
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE4 FHPTEQILASGSRDYTLKLFDYSKPSAKRAFKYIQEAEMLRSISFHPSGDFILVGTQHPT
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 FHPTEQILASGSRDYTLKLFDYSKPSAKRAFKYIQEAEMLRSISFHPSGDFILVGTQHPT
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KE4 LRLYDINTFQCFVSCNPQDQHTDAICSVNYNSSANMYVTGSKDGCIKLWDGVSNRCITTF
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 LRLYDINTFQCFVSCNPQDQHTDAICSVNYNSSANMYVTGSKDGCIKLWDGVSNRCITTF
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KE4 EKAHDGAEVCSAIFSKNSKYILSSGKDSVAKLWEISTGRTLVRYTGAGLSGRQVHRTQAV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 EKAHDGAEVCSAIFSKNSKYILSSGKDSVAKLWEISTGRTLVRYTGAGLSGRQVHRTQAV
              310       320       330       340       350       360

              370       380       390       400       410       420
pF1KE4 FNHTEDYVLLPDERTISLCCWDSRTAERRNLLSLGHNNIVRCIVHSPTNPGFMTCSDDFR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 FNHTEDYVLLPDERTISLCCWDSRTAERRNLLSLGHNNIVRCIVHSPTNPGFMTCSDDFR
              370       380       390       400       410       420

              430
pF1KE4 ARFWYRRSTTD
       :::::::::::
CCDS13 ARFWYRRSTTD
              430

>>CCDS3012.1 WDR5B gene_id:54554|Hs108|chr3 (330 aa)
 initn: 231 init1: 115 opt: 352 Z-score: 334.3 bits: 70.5 E(32554): 2.7e-12
Smith-Waterman score: 356; 26.0% identity (58.2% similar) in 311 aa overlap (116-424:31-326)

          90       100       110       120       130       140
pF1KE4 DADVQTMSPEASEYETCYVTSHKGPCRVATYSRDGQLIATGSADASIKILDTERMLAKSA
                                     :.    :..   : .:.:.  . . ::.:.
CCDS30 MATKESRDAKAQLALSSSANQSKEVPENPNYALKCTLVGHTEAVSSVKFSPNGEWLASSS
               10        20        30        40        50        60

         150       160       170       180       190       200
pF1KE4 MPIEVMMNETAQQNMENHPVIRTLYDHVDEVTCLAFHPTEQILASGSRDYTLKLFDYSKP
           ...  . . ..:     .::: :  :.. .:.    . :.:.: : ::::.:  .
CCDS30 ADRLIIIWGAYDGKYE-----KTLYGHNLEISDVAWSSDSSRLVSASDDKTLKLWDVRSG
               70             80        90       100       110

         210       220       230       240       250       260
pF1KE4 SAKRAFKYIQEAEMLRSISFHPSGDFILVGTQHPTLRLYDINTFQCFVSCNPQDQHTDAI
       .  ...:   .....   .:.: ...:. :.   :.......: .:. . .    :.: .
CCDS30 KCLKTLK--GHSNYVFCCNFNPPSNLIISGSFDETVKIWEVKTGKCLKTLS---AHSDPV
         120         130       140       150       160          170

         270       280       290       300       310       320
pF1KE4 CSVNYNSSANMYVTGSKDGCIKLWDGVSNRCITTFEKAHDGAEVCSAIFSKNSKYILSSG
        .:..: :... :.:: ::  ..::..:..:. :.    :.  :  . :: :.::::..
CCDS30 SAVHFNCSGSLIVSGSYDGLCRIWDAASGQCLKTLVD-DDNPPVSFVKFSPNGKYILTAT
              180       190       200        210       220

         330       340       350       360       370       380
pF1KE4 KDSVAKLWEISTGRTLVRYTGAGLSGRQVHRTQAVFNHTEDYVLLPDERTISLCCWDSRT
        :.. :::. : :: :  :::       .    : :. :    ..   .   .  :. .:
CCDS30 LDNTLKLWDYSRGRCLKTYTGHKNEKYCIF---ANFSVTGGKWIVSGSEDNLVYIWNLQT
     230       240       250          260       270       280

         390       400       410         420       430
pF1KE4 AERRNLLSLGHNNIVRCIVHSPTNPGFMTCS--DDFRARFWYRRSTTD
        :  . :. ::...:   .  ::.  . . .  .:   ..:
CCDS30 KEIVQKLQ-GHTDVVISAACHPTENLIASAALENDKTIKLWMSNH
        290        300       310       320       330

>>CCDS6981.1 WDR5 gene_id:11091|Hs108|chr9 (334 aa)
 initn: 278 init1: 110 opt: 324 Z-score: 309.0 bits: 65.9 E(32554): 6.8e-11
Smith-Waterman score: 324; 26.2% identity (60.0% similar) in 260 aa overlap (167-424:81-330)

        140       150       160       170       180       190
pF1KE4 TERMLAKSAMPIEVMMNETAQQNMENHPVIRTLYDHVDEVTCLAFHPTEQILASGSRDYT
                                     .:.  :   .. .:.    ..:.:.: : :
CCDS69 VKFSPNGEWLASSSADKLIKIWGAYDGKFEKTISGHKLGISDVAWSSDSNLLVSASDDKT
               60        70        80        90       100       110

        200       210       220       230       240       250
pF1KE4 LKLFDYSKPSAKRAFKYIQEAEMLRSISFHPSGDFILVGTQHPTLRLYDINTFQCFVSCN
       ::..: :. .  ...:   .....   .:.:....:. :.   ..:..:..: .:. .
CCDS69 LKIWDVSSGKCLKTLK--GHSNYVFCCNFNPQSNLIVSGSFDESVRIWDVKTGKCLKTL-
              120         130       140       150       160

        260       270       280       290       300       310
pF1KE4 PQDQHTDAICSVNYNSSANMYVTGSKDGCIKLWDGVSNRCITTFEKAHDGAEVCSAIFSK
       :   :.: . .:..: .... :..: ::  ..:: .:..:. :.    :.  :  . ::
CCDS69 PA--HSDPVSAVHFNRDGSLIVSSSYDGLCRIWDTASGQCLKTL-IDDDNPPVSFVKFSP
         170       180       190       200        210       220

        320       330       340       350       360       370
pF1KE4 NSKYILSSGKDSVAKLWEISTGRTLVRYTGAGLSGRQVHRTQAVFNHTEDYVLLPDERTI
       :.::::..  :.. :::. : :. :  :::      . .   : :. :    ..   .
CCDS69 NGKYILAATLDNTLKLWDYSKGKCLKTYTG---HKNEKYCIFANFSVTGGKWIVSGSEDN
          230       240       250          260       270       280

        380       390       400       410         420       430
pF1KE4 SLCCWDSRTAERRNLLSLGHNNIVRCIVHSPTNPGFMTCS--DDFRARFWYRRSTTD
        .  :. .: :  . :. ::...:   .  ::.  . . .  .:   ..:
CCDS69 LVYIWNLQTKEIVQKLQ-GHTDVVISTACHPTENIIASAALENDKTIKLWKSDC
             290        300       310       320       330

431 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sun Nov 6 00:42:49 2016 done: Sun Nov 6 00:42:49 2016
 Total Scan time: 2.720 Total Display time: 0.010

Function used was FASTA [36.3.4 Apr, 2011]