FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE3967, 404 aa
  1>>>pF1KE3967 404 - 404 aa - 404 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 6.7745+/-0.000885; mu= 9.6136+/- 0.052
 mean_var=123.1531+/-25.484, 0's: 0 Z-trim(109.5): 198 B-trim: 23 in 1/50
 Lambda= 0.115572
 statistics sampled from 10710 (10959) to 10710 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.708), E-opt: 0.2 (0.337), width: 16
 Scan time: 2.480

The best scores are:                                opt  bits E(32554)
CCDS9186.1 WSB2 gene_id:55884|Hs108|chr12   ( 404) 2819 481.4 6.6e-136
CCDS61252.1 WSB2 gene_id:55884|Hs108|chr12  ( 421) 2794 477.2 1.2e-134
CCDS61251.1 WSB2 gene_id:55884|Hs108|chr12  ( 194) 1313 230.0  1.4e-60
CCDS11220.1 WSB1 gene_id:26118|Hs108|chr17  ( 421) 1177 207.6  1.8e-53
CCDS11221.1 WSB1 gene_id:26118|Hs108|chr17  ( 275) 1007 179.1  4.3e-45
CCDS6981.1 WDR5 gene_id:11091|Hs108|chr9    ( 334)  358  71.0  1.9e-12

>>CCDS9186.1 WSB2 gene_id:55884|Hs108|chr12 (404 aa)
 initn: 2819 init1: 2819 opt: 2819 Z-score: 2553.4 bits: 481.4 E(32554): 6.6e-136
Smith-Waterman score: 2819; 100.0% identity (100.0% similar) in 404 aa overlap (1-404:1-404)

               10        20        30        40        50        60
pF1KE3 MEAGEEPLLLAELKPGRPHQFDWKSSCETWSVAFSPDGSWFAWSQGHCIVKLIPWPLEEQ
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS91 MEAGEEPLLLAELKPGRPHQFDWKSSCETWSVAFSPDGSWFAWSQGHCIVKLIPWPLEEQ
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE3 FIPKGFEAKSRSSKNETKGRGSPKEKTLDCGQIVWGLAFSPWPSPPSRKLWARHHPQVPD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS91 FIPKGFEAKSRSSKNETKGRGSPKEKTLDCGQIVWGLAFSPWPSPPSRKLWARHHPQVPD
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE3 VSCLVLATGLNDGQIKIWEVQTGLLLLNLSGHQDVVRDLSFTPSGSLILVSASRDKTLRI
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS91 VSCLVLATGLNDGQIKIWEVQTGLLLLNLSGHQDVVRDLSFTPSGSLILVSASRDKTLRI
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE3 WDLNKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSVFLWSMRSYTLIRKLEGHQS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS91 WDLNKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSVFLWSMRSYTLIRKLEGHQS
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KE3 SVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVDPAMDDSDVHISSLRSVC
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS91 SVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVDPAMDDSDVHISSLRSVC
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KE3 FSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCTFFPHGGVIATGTRDGHVQFW
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS91 FSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCTFFPHGGVIATGTRDGHVQFW
              310       320       330       340       350       360

              370       380       390       400
pF1KE3 TAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRTF
       ::::::::::::::::::::::::::::::::::::::::::::
CCDS91 TAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRTF
              370       380       390       400

>>CCDS61252.1 WSB2 gene_id:55884|Hs108|chr12 (421 aa)
 initn: 2794 init1: 2794 opt: 2794 Z-score: 2530.6 bits: 477.2 E(32554): 1.2e-134
Smith-Waterman score: 2794; 99.8% identity (99.8% similar) in 402 aa overlap (3-404:20-421)

                                10        20        30        40
pF1KE3                  MEAGEEPLLLAELKPGRPHQFDWKSSCETWSVAFSPDGSWFAW
                          : :::::::::::::::::::::::::::::::::::::::
CCDS61 MRVDRESRFLRGTGTGEAVAVEEPLLLAELKPGRPHQFDWKSSCETWSVAFSPDGSWFAW
               10        20        30        40        50        60

            50        60        70        80        90       100
pF1KE3 SQGHCIVKLIPWPLEEQFIPKGFEAKSRSSKNETKGRGSPKEKTLDCGQIVWGLAFSPWP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 SQGHCIVKLIPWPLEEQFIPKGFEAKSRSSKNETKGRGSPKEKTLDCGQIVWGLAFSPWP
               70        80        90       100       110       120

           110       120       130       140       150       160
pF1KE3 SPPSRKLWARHHPQVPDVSCLVLATGLNDGQIKIWEVQTGLLLLNLSGHQDVVRDLSFTP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 SPPSRKLWARHHPQVPDVSCLVLATGLNDGQIKIWEVQTGLLLLNLSGHQDVVRDLSFTP
              130       140       150       160       170       180

           170       180       190       200       210       220
pF1KE3 SGSLILVSASRDKTLRIWDLNKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSVFL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 SGSLILVSASRDKTLRIWDLNKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSVFL
              190       200       210       220       230       240

           230       240       250       260       270       280
pF1KE3 WSMRSYTLIRKLEGHQSSVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 WSMRSYTLIRKLEGHQSSVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVD
              250       260       270       280       290       300

           290       300       310       320       330       340
pF1KE3 PAMDDSDVHISSLRSVCFSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCTFFP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 PAMDDSDVHISSLRSVCFSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCTFFP
              310       320       330       340       350       360

           350       360       370       380       390       400
pF1KE3 HGGVIATGTRDGHVQFWTAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRT
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 HGGVIATGTRDGHVQFWTAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRT
              370       380       390       400       410       420

pF1KE3 F
       :
CCDS61 F

>>CCDS61251.1 WSB2 gene_id:55884|Hs108|chr12 (194 aa)
 initn: 1313 init1: 1313 opt: 1313 Z-score: 1200.8 bits: 230.0 E(32554): 1.4e-60
Smith-Waterman score: 1313; 100.0% identity (100.0% similar) in 194 aa overlap (211-404:1-194)

              190       200       210       220       230       240
pF1KE3 WDLNKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSVFLWSMRSYTLIRKLEGHQS
                                     ::::::::::::::::::::::::::::::
CCDS61                               MLCSAAGEKSVFLWSMRSYTLIRKLEGHQS
                                             10        20        30

              250       260       270       280       290       300
pF1KE3 SVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVDPAMDDSDVHISSLRSVC
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 SVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVDPAMDDSDVHISSLRSVC
               40        50        60        70        80        90

              310       320       330       340       350       360
pF1KE3 FSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCTFFPHGGVIATGTRDGHVQFW
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 FSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCTFFPHGGVIATGTRDGHVQFW
              100       110       120       130       140       150

              370       380       390       400
pF1KE3 TAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRTF
       ::::::::::::::::::::::::::::::::::::::::::::
CCDS61 TAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRTF
              160       170       180       190

>>CCDS11220.1 WSB1
gene_id:26118|Hs108|chr17 (421 aa) initn: 1417 init1: 714 opt: 1177 Z-score: 1073.5 bits: 207.6 E(32554): 1.8e-53 Smith-Waterman score: 1356; 51.5% identity (74.8% similar) in 400 aa overlap (13-402:23-420) 10 20 30 40 50 pF1KE3 MEAGEEPLLLAELKPGRPHQFDWKSSCETWSVAFSPDGSWFAWSQGHCIV : :. : :: : . :.:.:::.::::.::::::: : CCDS11 MASFPPRVNEKEIVRLRTIGELLAPAAP--FDKKCGRENWTVAFAPDGSYFAWSQGHRTV 10 20 30 40 50 60 70 80 90 100 pF1KE3 KLIPWPLEEQ-FIPKGFEAKSRSS------KNETKG-RGSPKEKTLDCGQIVWGLAF-SP ::.:: : :. .: . . :: .: : ...:.:. .:::.:::.::: : CCDS11 KLVPWSQCLQNFLLHGTKNVTNSSSLRLPRQNSDGGQKNKPREHIIDCGDIVWSLAFGSS 60 70 80 90 100 110 110 120 130 140 150 160 pF1KE3 WPSPPSRKLWARHHPQVPDVSCLVLATGLNDGQIKIWEVQTGLLLLNLSGHQDVVRDLSF : :: . . : . :.::::::.:.::::.: :: ::::: : .:::::.: CCDS11 VPEKQSRCVNIEWHRFRFGQDQLLLATGLNNGRIKIWDVYTGKLLLNLVDHTEVVRDLTF 120 130 140 150 160 170 170 180 190 200 210 220 pF1KE3 TPSGSLILVSASRDKTLRIWDLNKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSV .:.:::::::::::::::.:::. :....:: :: .::: :..::: :::::... :.: CCDS11 APDGSLILVSASRDKTLRVWDLKDDGNMMKVLRGHQNWVYSCAFSPDSSMLCSVGASKAV 180 190 200 210 220 230 230 240 250 260 270 280 pF1KE3 FLWSMRSYTLIRKLEGHQSSVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQ :::.: .::.:::::::. .::.::::::.:::.:::::: : .:::..:. : . : CCDS11 FLWNMDKYTMIRKLEGHHHDVVACDFSPDGALLATASYDTRVYIWDPHNGDILMEFGHLF 240 250 260 270 280 290 290 300 310 320 330 340 pF1KE3 VDPA-MDDSDVHISSLRSVCFSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCT :. . . .. .::: :: .::..:..:::...:.: .. :. ::..:::::. CCDS11 PPPTPIFAGGANDRWVRSVSFSHDGLHVASLADDKMVRFWRIDEDYPVQVAPLSNGLCCA 300 310 320 330 340 350 350 360 370 380 390 400 pF1KE3 FFPHGGVIATGTRDGHVQFWTAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLT : :.:.:.::.:: : ::..:: . ::.:::: ..: . : .: ::::.:. :::. 
CCDS11 FSTDGSVLAAGTHDGSVYFWATPRQVPSLQHLCRMSIRRVMPTQEVQELPIPSKLLEFLS 360 370 380 390 400 410 pF1KE3 YRTF :: CCDS11 YRI 420 >>CCDS11221.1 WSB1 gene_id:26118|Hs108|chr17 (275 aa) initn: 1078 init1: 615 opt: 1007 Z-score: 923.0 bits: 179.1 E(32554): 4.3e-45 Smith-Waterman score: 1007; 53.4% identity (79.1% similar) in 268 aa overlap (136-402:7-274) 110 120 130 140 150 160 pF1KE3 PSRKLWARHHPQVPDVSCLVLATGLNDGQIKIWEVQTGLLLLNLSGHQDVVRDLSFTPSG .. : . : ::::: : .:::::.:.:.: CCDS11 MASFPPRVNEKEIGKLLLNLVDHTEVVRDLTFAPDG 10 20 30 170 180 190 200 210 220 pF1KE3 SLILVSASRDKTLRIWDLNKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSVFLWS ::::::::::::::.:::. :....:: :: .::: :..::: :::::... :.::::. CCDS11 SLILVSASRDKTLRVWDLKDDGNMMKVLRGHQNWVYSCAFSPDSSMLCSVGASKAVFLWN 40 50 60 70 80 90 230 240 250 260 270 280 pF1KE3 MRSYTLIRKLEGHQSSVVSCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVDPA : .::.:::::::. .::.::::::.:::.:::::: : .:::..:. : . : :. CCDS11 MDKYTMIRKLEGHHHDVVACDFSPDGALLATASYDTRVYIWDPHNGDILMEFGHLFPPPT 100 110 120 130 140 150 290 300 310 320 330 340 pF1KE3 -MDDSDVHISSLRSVCFSPEGLYLATVADDRLLRIWALELKTPIAFAPMTNGLCCTFFPH . . .. .::: :: .::..:..:::...:.: .. :. ::..:::::.: CCDS11 PIFAGGANDRWVRSVSFSHDGLHVASLADDKMVRFWRIDEDYPVQVAPLSNGLCCAFSTD 160 170 180 190 200 210 350 360 370 380 390 400 pF1KE3 GGVIATGTRDGHVQFWTAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRTF :.:.:.::.:: : ::..:: . ::.:::: ..: . : .: ::::.:. :::.:: CCDS11 GSVLAAGTHDGSVYFWATPRQVPSLQHLCRMSIRRVMPTQEVQELPIPSKLLEFLSYRI 220 230 240 250 260 270 >>CCDS6981.1 WDR5 gene_id:11091|Hs108|chr9 (334 aa) initn: 356 init1: 221 opt: 358 Z-score: 336.9 bits: 71.0 E(32554): 1.9e-12 Smith-Waterman score: 424; 30.5% identity (61.0% similar) in 272 aa overlap (101-360:28-286) 80 90 100 110 120 pF1KE3 RSSKNETKGRGSPKEKTLDCGQIVWGLAFSPWPSPPSRKL---WARHHPQVPDVSCLV-- : : :. : : : : .:. CCDS69 MATEEKKPETEAARAQPTPSSSATQSKPTPVKPNYALKFTLAGHTKAVSSVKFSPNG 10 20 30 40 50 130 140 150 160 170 180 pF1KE3 --LATGLNDGQIKIWEVQTGLLLLNLSGHQDVVRDLSFTPSGSLILVSASRDKTLRIWDL ::.. : :::: . : . ..:::. . 
:.... : : .::::: ::::.:::. CCDS69 EWLASSSADKLIKIWGAYDGKFEKTISGHKLGISDVAWS-SDSNLLVSASDDKTLKIWDV 60 70 80 90 100 110 190 200 210 220 230 240 pF1KE3 NKHGKQIQVLSGHLQWVYCCSISPDCSMLCSAAGEKSVFLWSMRSYTLIRKLEGHQSSVV .. :: ...:.:: ..:.::...:. ... :.. ..:: .:.... .. : .:.. : CCDS69 SS-GKCLKTLKGHSNYVFCCNFNPQSNLIVSGSFDESVRIWDVKTGKCLKTLPAHSDPVS 120 130 140 150 160 170 250 260 270 280 290 300 pF1KE3 SCDFSPDSALLVTASYDTNVIMWDPYTGERLRSLHHTQVDPAMDDSDVHISSLRSVCFSP . :. :..:.:..::: .:: .:. :..: .::.. .: .. ::: CCDS69 AVHFNRDGSLIVSSSYDGLCRIWDTASGQCLKTL--------IDDDNPPVSFVK---FSP 180 190 200 210 220 310 320 330 340 350 pF1KE3 EGLYLATVADDRLLRIWAL-ELKTPIAFAPMTNGLCCTF----FPHGGVIATGTRDGHVQ .: :. ... : :..: . : ... : : : : :..:..:. : CCDS69 NGKYILAATLDNTLKLWDYSKGKCLKTYTGHKNEKYCIFANFSVTGGKWIVSGSEDNLVY 230 240 250 260 270 280 360 370 380 390 400 pF1KE3 FWTAPRVLSSLKHLCRKALRSFLTTYQVLALPIPKKMKEFLTYRTF .: CCDS69 IWNLQTKEIVQKLQGHTDVVISTACHPTENIIASAALENDKTIKLWKSDC 290 300 310 320 330 404 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Sun Nov 6 08:29:30 2016 done: Sun Nov 6 08:29:31 2016 Total Scan time: 2.480 Total Display time: 0.010 Function used was FASTA [36.3.4 Apr, 2011]