FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE2470, 474 aa
  1>>>pF1KE2470 474 - 474 aa - 474 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 9.9808+/-0.00109; mu= 0.5614+/- 0.067
 mean_var=388.6641+/-80.380, 0's: 0 Z-trim(115.2): 49 B-trim: 0 in 0/51
 Lambda= 0.065056
 statistics sampled from 15754 (15800) to 15754 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.773), E-opt: 0.2 (0.485), width: 16
 Scan time: 4.040

The best scores are:                                      opt  bits E(32554)
CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6            ( 474) 3123 306.9 2.9e-83
CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2           ( 441)  718  81.2 2.5e-15
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20         ( 315)  647  74.3   2e-13

>>CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6 (474 aa)
 initn: 3123 init1: 3123 opt: 3123 Z-score: 1608.3 bits: 306.9 E(32554): 2.9e-83
Smith-Waterman score: 3123; 100.0% identity (100.0% similar) in 474 aa overlap (1-474:1-474)

               10        20        30        40        50        60
pF1KE2 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE2 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE2 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE2 GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 
GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAE 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE2 QAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS45 QAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFG 250 260 270 280 290 300 310 320 330 340 350 360 pF1KE2 GLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADHRG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS45 GLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADHRG 310 320 330 340 350 360 370 380 390 400 410 420 pF1KE2 YASLRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS45 YASLRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLG 370 380 390 400 410 420 430 440 450 460 470 pF1KE2 SFSSSSALDRDLDFNFEPGSGSHFEFPDYCTPEVSEMISGDWLESSISNLVFTY :::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS45 SFSSSSALDRDLDFNFEPGSGSHFEFPDYCTPEVSEMISGDWLESSISNLVFTY 430 440 450 460 470 >>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa) initn: 1098 init1: 628 opt: 718 Z-score: 388.7 bits: 81.2 E(32554): 2.5e-15 Smith-Waterman score: 1010; 43.3% identity (64.9% similar) in 490 aa overlap (1-474:1-441) 10 20 30 40 50 60 pF1KE2 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK ::::... : .:. : :. :. : :. .: ::. . . ::.:::: ::::: CCDS16 MVQQAESLE-AESNLPREALDTEEG-EF-MACSPVALDES-------DPDWCKTASGHIK 10 20 30 40 50 70 80 90 100 110 120 pF1KE2 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM ::::::::::.:::::::::::::::::::::::::::.::::.:::::::::::::::: CCDS16 RPMNAFMVWSKIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHM 60 70 80 90 100 110 130 140 150 160 170 180 pF1KE2 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG ::::::::::::: : . :.. .::..: .: ...:::: .:::.::.... :. 
CCDS16 ADYPDYKYRPRKKPK---MDPSAKPSASQSP----EKSAAGGGGGSAGGGAGGAKTSKGS 120 130 140 150 160 190 200 210 220 230 pF1KE2 GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHA--KLILAGGGGGGKAAAAAAASFA . . : . . : . :. . .: ::.. .. .: ..:.:::: :. .. : CCDS16 SKKCGKLKAPAAAGAKAGAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGG-AGKTVKCVFL 170 180 190 200 210 220 240 250 260 270 280 290 pF1KE2 AEQAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYL :. :. .: . :. . . : ::.. ... : : CCDS16 DEDDDD------DDDDDELQLQIKQEPD---EEDEEPPHQQLLQPPGQQ--PSQLLRRYN 230 240 250 260 270 300 310 320 330 340 350 pF1KE2 FGGLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADH . . .::. ....:. . .::.: :: :.: :.: : . CCDS16 VAKV--PASPT--LSSSAESPEGASLYDEVRAG--------------ATSGAGGGSRL-Y 280 290 300 310 360 370 380 390 400 410 pF1KE2 RGYASLRAASPAPSSAP--SHASSSASSHSSSSSSSGSSSSDDEFEDDL---LDLNPSSN .. .. : : . : : ::: . : ::::::..::.:. : ::: :.:: :.. CCDS16 YSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADDLMFDLSLNFSQS 320 330 340 350 360 370 420 430 440 450 460 pF1KE2 FESMS---LGSFSSSSAL-----DRDLDFNFEPGS-GSHFEFPDYCTPEVSEMISGDWLE .: : ::. .... : :.::: .: :: :::::::::::::.::::.::::: CCDS16 AHSASEQQLGGGAAAGNLSLSLVDKDLD-SFSEGSLGSHFEFPDYCTPELSEMIAGDWLE 380 390 400 410 420 430 470 pF1KE2 SSISNLVFTY ...:.::::: CCDS16 ANFSDLVFTY 440 >>CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 (315 aa) initn: 862 init1: 580 opt: 647 Z-score: 354.4 bits: 74.3 E(32554): 2e-13 Smith-Waterman score: 679; 38.5% identity (50.2% similar) in 442 aa overlap (34-474:16-315) 10 20 30 40 50 60 pF1KE2 QTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIKRPM : :: . 
: : .:.::::::::::::: CCDS12 MVQQRGARAKRDGGPPPPGPGPAEEG-AREPGWCKTPSGHIKRPM 10 20 30 40 70 80 90 100 110 120 pF1KE2 NAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHMADY :::::::: ::::::.: :::::::::::::.::.::.::.::::.:::::::::::::: CCDS12 NAFMVWSQHERRKIMDQWPDMHNAEISKRLGRRWQLLQDSEKIPFVREAERLRLKHMADY 50 60 70 80 90 100 130 140 150 160 170 180 pF1KE2 PDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGGGGG :::::::::: :..: :...: : CCDS12 PDYKYRPRKK--------SKGAPAKARPRPPG---------------------------- 110 120 190 200 210 220 230 240 pF1KE2 ASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAEQAG .::::. :: : .. : :: . ::: :: ::: . CCDS12 GSGGGSRLKP------GPQLPG-RGGRRA--------AGGPLGGGAAAPEDDD------- 130 140 150 160 250 260 270 280 290 300 pF1KE2 AAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFGGLG : . : ..: . .::..: :. CCDS12 ---------EDDDEELLEVRL----------------VETPGREL-----WRMV------ 170 180 190 310 320 330 340 350 360 pF1KE2 TSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADHRGYAS :.: .. : :.. : :: : ..:: CCDS12 ----PAGRAARGQ---------AERAQG-----PSGEGAAAAA----------------- 200 210 370 380 390 400 410 420 pF1KE2 LRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLGSFS ::::.:: . .. . . .: .: : : :. .: CCDS12 --AASPTPSEDEEPEEEEEEAAAAEEGEEETVASGEESLGFLSRLPPGPA----GL---- 220 230 240 250 260 430 440 450 460 470 pF1KE2 SSSALDRDLDFNFEPGSG-SHFEFPDYCTPEVSEMISGDWLESSISNLVFTY . :::::: : ..: :: :::::::::::::.:::.::: :::..::::: CCDS12 DCSALDRDPD--LQPPSGTSHFEFPDYCTPEVTEMIAGDWRPSSIADLVFTY 270 280 290 300 310 474 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Mon Nov 7 20:29:47 2016 done: Mon Nov 7 20:29:47 2016 Total Scan time: 4.040 Total Display time: 0.000 Function used was FASTA [36.3.4 Apr, 2011]