FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE5059, 307 aa
  1>>>pF1KE5059 307 - 307 aa - 307 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 8.1326+/-0.000881; mu= 7.0877+/- 0.053
 mean_var=328.0145+/-68.014, 0's: 0 Z-trim(117.5): 809 B-trim: 0 in 0/52
 Lambda= 0.070815
 statistics sampled from 17281 (18219) to 17281 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.832), E-opt: 0.2 (0.56), width: 16
 Scan time: 2.770

The best scores are:                                       opt  bits E(32554)
CCDS13006.1 SCRT2 gene_id:85508|Hs108|chr20        ( 307) 2199 237.4 1.1e-62
CCDS6421.1 SCRT1 gene_id:83482|Hs108|chr8          ( 348) 1069 122.0 6.5e-28
CCDS6146.1 SNAI2 gene_id:6591|Hs108|chr8           ( 268)  605  74.4   1e-13
CCDS32505.1 SNAI3 gene_id:333929|Hs108|chr16       ( 292)  578  71.7 7.4e-13
CCDS13423.1 SNAI1 gene_id:6615|Hs108|chr20         ( 264)  519  65.6 4.6e-11

>>CCDS13006.1 SCRT2 gene_id:85508|Hs108|chr20 (307 aa)
 initn: 2199 init1: 2199 opt: 2199 Z-score: 1239.1 bits: 237.4 E(32554): 1.1e-62
Smith-Waterman score: 2199; 100.0% identity (100.0% similar) in 307 aa overlap (1-307:1-307)

               10        20        30        40        50        60
pF1KE5 MPRSFLVKKIKGDGFQCSGVPAPTYHPLETAYVLPGARGPPGDNGYAPHRLPPSSYDADQ
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 MPRSFLVKKIKGDGFQCSGVPAPTYHPLETAYVLPGARGPPGDNGYAPHRLPPSSYDADQ
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 KPGLELAPAEPAYPPAAPEEYSDPESPQSSLSARYFRGEAAVTDSYSMDAFFISDGRSRR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 KPGLELAPAEPAYPPAAPEEYSDPESPQSSLSARYFRGEAAVTDSYSMDAFFISDGRSRR
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE5 RRGGGGGDAGGSGDAGGAGGRAGRAGAQAGGGHRHACAECGKTYATSSNLSRHKQTHRSL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 RRGGGGGDAGGSGDAGGAGGRAGRAGAQAGGGHRHACAECGKTYATSSNLSRHKQTHRSL
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE5 DSQLARKCPTCGKAYVSMPALAMHLLTHNLRHKCGVCGKAFSRPWLLQGHMRSHTGEKPF
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 DSQLARKCPTCGKAYVSMPALAMHLLTHNLRHKCGVCGKAFSRPWLLQGHMRSHTGEKPF
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KE5 GCAHCGKAFADRSNLRAHMQTHSAFKHYRCRQCDKSFALKSYLHKHCEAACAKAAEPPPP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 GCAHCGKAFADRSNLRAHMQTHSAFKHYRCRQCDKSFALKSYLHKHCEAACAKAAEPPPP
              250       260       270       280       290       300

pF1KE5 TPAGPAS
       :::::::
CCDS13 TPAGPAS

>>CCDS6421.1 SCRT1 gene_id:83482|Hs108|chr8 (348 aa)
 initn: 1310 init1: 994 opt: 1069 Z-score: 614.6 bits: 122.0 E(32554): 6.5e-28
Smith-Waterman score: 1215; 58.5% identity (71.8% similar) in 347 aa overlap (5-305:5-341)

               10        20        30           40         50
pF1KE5 MPRSFLVKKIKGDGFQCSGVPAPTYHPLETAYVLPGAR---GPP-GDNGYAPHRLPPSS-
           :::::.: :.:. .         ::.::    ::   : :  :.::    . :::
CCDS64 MPRSFLVKKVKLDAFSSAD--------LESAY--GRARSDLGAPLHDKGYLSDYVGPSSV
               10                20          30        40        50

          60         70        80               90          100
pF1KE5 YDADQKPGLELAPA-EPAYPPAAPEEY-------SDPESPQSSLSAR---YFRGEAAVTD
       ::.: . .:  .:. :: :  :.  :        . : .:.  :..    :. :.:::..
CCDS64 YDGDAEAALLKGPSPEPMYAAAVRGELGPAAAGSAPPPTPRPELATAAGGYINGDAAVSE
               60        70        80        90       100       110

          110       120                     130             140
pF1KE5 SYSMDAFFISDGRSRRRRGGGG--------------GDAGGSGDAGG------AGGRAG-
       .:. :::::.::::::. ...:              :::::.: :::       :::.:
CCDS64 GYAADAFFITDGRSRRKASNAGAAAAPSTASAAAPDGDAGGGGGAGGRSLGSGPGGRGGT
              120       130       140       150       160       170

                    150       160       170       180       190
pF1KE5 RAGA---------QAGGGHRHACAECGKTYATSSNLSRHKQTHRSLDSQLARKCPTCGKA
       ::::          ::.: ::::.::::::::::::::::::::::::::::.::::::.
CCDS64 RAGAGTEARAGPGAAGAGGRHACGECGKTYATSSNLSRHKQTHRSLDSQLARRCPTCGKV
              180       190       200       210       220       230

          200       210       220       230       240       250
pF1KE5 YVSMPALAMHLLTHNLRHKCGVCGKAFSRPWLLQGHMRSHTGEKPFGCAHCGKAFADRSN
       ::::::.:::::::.:::::::::::::::::::::::::::::::::::::::::::::
CCDS64 YVSMPAMAMHLLTHDLRHKCGVCGKAFSRPWLLQGHMRSHTGEKPFGCAHCGKAFADRSN
              240       250       260       270       280       290

          260       270       280       290       300
pF1KE5 LRAHMQTHSAFKHYRCRQCDKSFALKSYLHKHCEAACAKAAEPPPPTPAGPAS
       :::::::::::::..:..: :::::::::.:: :.:: :..   : .:: :
CCDS64 LRAHMQTHSAFKHFQCKRCKKSFALKSYLNKHYESACFKGGAGGPAAPAPPQLSPVQA
              300       310       320       330       340

>>CCDS6146.1 SNAI2 gene_id:6591|Hs108|chr8 (268 aa)
 initn: 810 init1: 582 opt: 605 Z-score: 359.6 bits: 74.4 E(32554): 1e-13
Smith-Waterman score: 614; 39.5% identity (59.9% similar) in 299 aa overlap (1-294:1-267)

               10        20        30        40        50        60
pF1KE5 MPRSFLVKKIKGDGFQCSGVPAPTYHPLETAYVLPGARGPPGDNGYAPHRLPPSSYDADQ
       :::::::::     :. :    :.:  :.:  :. .   :   ..:.   .:       :
CCDS61 MPRSFLVKK----HFNAS--KKPNYSELDTHTVIIS---PYLYESYSMPVIP-------Q
                   10          20        30           40

               70          80          90        100       110
pF1KE5 KPGLELAPAEP--AYPPAAPEEYSDPE--SPQSSLSARYFR-GEAAVTDSYSMDAFFISD
          :  .   :  ..  :::   . :.  :: :. :.   : .    .:. : :
CCDS61 PEILSSGAYSPITVWTTAAPFHAQLPNGLSPLSGYSSSLGRVSPPPPSDTSSKDH-----
           50        60        70        80        90

         120       130       140       150       160       170
pF1KE5 GRSRRRRGGGGGDAGGSGDAGGAGGRAGRAGAQAGGGHRHACAECGKTYATSSNLSRHKQ
                .:...  : .     ..   .  .:  ...  :  :.:::.: :.:..:::
CCDS61 ---------SGSESPISDEEERLQSKL--SDPHAIEAEKFQCNLCNKTYSTFSGLAKHKQ
              100       110         120       130       140

         180       190       200       210       220       230
pF1KE5 THRSLDSQLARKCPTCGKAYVSMPALAMHLLTHNLRHKCGVCGKAFSRPWLLQGHMRSHT
        : . .:. . .:  : : :::. :: ::. ::.:   : .::::::::::::::.:.::
CCDS61 LHCDAQSRKSFSCKYCDKEYVSLGALKMHIRTHTLPCVCKICGKAFSRPWLLQGHIRTHT
      150       160       170       180       190       200

         240       250       260       270       280       290
pF1KE5 GEKPFGCAHCGKAFADRSNLRAHMQTHSAFKHYRCRQCDKSFALKSYLHKHCEAACAKAA
       :::::.: ::..:::::::::::.::::  :.:.:..:.:.:.  : :::: :..:  :
CCDS61 GEKPFSCPHCNRAFADRSNLRAHLQTHSDVKKYQCKNCSKTFSRMSLLHKHEESGCCVAH
      210       220       230       240       250       260

         300
pF1KE5 EPPPPTPAGPAS

>>CCDS32505.1 SNAI3 gene_id:333929|Hs108|chr16 (292 aa)
 initn: 610 init1: 498 opt: 578 Z-score: 344.3 bits: 71.7 E(32554): 7.4e-13
Smith-Waterman score: 589; 39.4% identity (59.3% similar) in 307 aa overlap (1-291:1-288)

               10        20        30        40        50        60
pF1KE5 MPRSFLVKKIKGDGFQCSGVPAPTYHPLETAYVLPGARGPPGDNGYAPHRLPPSSYDADQ
       ::::::::         :.  .:.:. :::   . :: .  : .  .:  : : . .: .
CCDS32 MPRSFLVKT-------HSSHRVPNYRRLETQREINGACSACG-GLVVP--LLPRDKEAPS
                      10        20        30         40          50

               70        80          90        100       110
pF1KE5 KPGLELAPAEPAYPPAAPEEYSDPESP--QSSLSARYFRG-EAAVTDSYSMDAFFI--SD
        :: .: : .:    .:    : :  :  . .:.:  . . :.. .:  .  : ..  .:
CCDS32 VPG-DL-P-QPWDRSSAVACISLPLLPRIEEALGASGLDALEVSEVDPRASRAAIVPLKD
                  60        70        80        90       100

                    120       130       140       150       160
pF1KE5 GRSR-----------RRRGGGGGDAGGSGDAGGAGGRAGRAGAQAGGGHRHACAECGKTY
       . ..           :     : :  :. .   .. :  ::     :: .  : .: : :
CCDS32 SLNHLNLPPLLVLPTRWSPTLGPDRHGAPEKLLGAERMPRA----PGGFE--CFHCHKPY
       110       120       130       140           150         160

          170       180       190       200       210       220
pF1KE5 ATSSNLSRHKQTHRSLDSQLARKCPTCGKAYVSMPALAMHLLTHNLRHKCGVCGKAFSRP
        : ..:.::.: :  :.   .  :  : : :.:. :: ::. ::.:   : .::::::::
CCDS32 HTLAGLARHRQLHCHLQVGRVFTCKYCDKEYTSLGALKMHIRTHTLPCTCKICGKAFSRP
             170       180       190       200       210       220

          230       240       250       260       270       280
pF1KE5 WLLQGHMRSHTGEKPFGCAHCGKAFADRSNLRAHMQTHSAFKHYRCRQCDKSFALKSYLH
       ::::::.:.::::::..:.::..:::::::::::.::::  :.::::.: :.:.  : :
CCDS32 WLLQGHVRTHTGEKPYACSHCSRAFADRSNLRAHLQTHSDAKKYRCRRCTKTFSRMSLLA
             230       240       250       260       270       280

          290       300
pF1KE5 KHCEAACAKAAEPPPPTPAGPAS
       .: :..:
CCDS32 RHEESGCCPGP
             290

>>CCDS13423.1 SNAI1 gene_id:6615|Hs108|chr20 (264 aa)
 initn: 669 init1: 467 opt: 519 Z-score: 312.1 bits: 65.6 E(32554): 4.6e-11
Smith-Waterman score: 519; 58.3% identity (82.6% similar) in 115 aa overlap (178-292:146-260)

       150       160       170       180       190       200
pF1KE5 QAGGGHRHACAECGKTYATSSNLSRHKQTHRSLDSQLARKCPTCGKAYVSMPALAMHLLT
                                     ..:... : .:  :.: :.:. :: ::. .
CCDS13 TSVSSLEAEAYAAFPGLGQVPKQLAQLSEAKDLQARKAFNCKYCNKEYLSLGALKMHIRS
         120       130       140       150       160       170

       210       220       230       240       250       260
pF1KE5 HNLRHKCGVCGKAFSRPWLLQGHMRSHTGEKPFGCAHCGKAFADRSNLRAHMQTHSAFKH
       :.:   ::.::::::::::::::.:.:::::::.: ::..:::::::::::.::::  :.
CCDS13 HTLPCVCGTCGKAFSRPWLLQGHVRTHTGEKPFSCPHCSRAFADRSNLRAHLQTHSDVKK
         180       190       200       210       220       230

       270       280       290       300
pF1KE5 YRCRQCDKSFALKSYLHKHCEAACAKAAEPPPPTPAGPAS
       :.:. : ..:.  : :::: :..:.
CCDS13 YQCQACARTFSRMSLLHKHQESGCSGCPR
         240       250       260

307 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Tue Nov 8 04:37:51 2016 done: Tue Nov 8 04:37:52 2016
 Total Scan time: 2.770 Total Display time: -0.020

Function used was FASTA [36.3.4 Apr, 2011]
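The rows listed under "The best scores are:" in this report share one layout: accession, free-text description, library sequence length in parentheses, then the opt score, bit score, and E() value. A minimal Python sketch for pulling those fields out of one row could look like the following. The names `Hit`, `HIT_RE`, and `parse_hit` are hypothetical helpers introduced here for illustration; they are not part of the FASTA package, and the pattern is only an assumption based on the rows shown in this report.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One row of the best-scores table (hypothetical container)."""
    accession: str
    description: str
    length: int
    opt: int
    bits: float
    evalue: float


# Assumed row layout, inferred from this report only, e.g.
# "CCDS13006.1 SCRT2 gene_id:85508|Hs108|chr20 ( 307) 2199 237.4 1.1e-62"
HIT_RE = re.compile(
    r"^(?P<acc>\S+)\s+(?P<desc>.*?)\s*\(\s*(?P<len>\d+)\)\s+"
    r"(?P<opt>\d+)\s+(?P<bits>\d+(?:\.\d+)?)\s+"
    r"(?P<e>\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$"
)


def parse_hit(line: str) -> Optional[Hit]:
    """Return a Hit for a best-scores row, or None for any other line."""
    m = HIT_RE.match(line.strip())
    if m is None:
        return None
    return Hit(
        accession=m["acc"],
        description=m["desc"],
        length=int(m["len"]),
        opt=int(m["opt"]),
        bits=float(m["bits"]),
        evalue=float(m["e"]),
    )


if __name__ == "__main__":
    row = "CCDS6146.1 SNAI2 gene_id:6591|Hs108|chr8 ( 268) 605 74.4 1e-13"
    print(parse_hit(row))
```

Because the pattern requires the parenthesised length to be followed by three numeric fields, the table's own header line and the alignment lines are rejected rather than mis-parsed.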