FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KE2470, 474 aa
1>>>pF1KE2470 474 - 474 aa - 474 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 9.9808+/-0.00109; mu= 0.5614+/- 0.067
mean_var=388.6641+/-80.380, 0's: 0 Z-trim(115.2): 49 B-trim: 0 in 0/51
Lambda= 0.065056
statistics sampled from 15754 (15800) to 15754 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.773), E-opt: 0.2 (0.485), width: 16
Scan time: 4.040
The best scores are:                                opt  bits E(32554)
CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6     ( 474) 3123 306.9  2.9e-83
CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2    ( 441)  718  81.2  2.5e-15
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20  ( 315)  647  74.3    2e-13
>>CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6 (474 aa)
initn: 3123 init1: 3123 opt: 3123 Z-score: 1608.3 bits: 306.9 E(32554): 2.9e-83
Smith-Waterman score: 3123; 100.0% identity (100.0% similar) in 474 aa overlap (1-474:1-474)
               10        20        30        40        50        60
pF1KE2 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
               10        20        30        40        50        60
               70        80        90       100       110       120
pF1KE2 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
               70        80        90       100       110       120
              130       140       150       160       170       180
pF1KE2 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
              130       140       150       160       170       180
              190       200       210       220       230       240
pF1KE2 GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAE
              190       200       210       220       230       240
              250       260       270       280       290       300
pF1KE2 QAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 QAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFG
              250       260       270       280       290       300
              310       320       330       340       350       360
pF1KE2 GLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADHRG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 GLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADHRG
              310       320       330       340       350       360
              370       380       390       400       410       420
pF1KE2 YASLRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 YASLRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLG
              370       380       390       400       410       420
              430       440       450       460       470
pF1KE2 SFSSSSALDRDLDFNFEPGSGSHFEFPDYCTPEVSEMISGDWLESSISNLVFTY
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 SFSSSSALDRDLDFNFEPGSGSHFEFPDYCTPEVSEMISGDWLESSISNLVFTY
              430       440       450       460       470
>>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa)
initn: 1098 init1: 628 opt: 718 Z-score: 388.7 bits: 81.2 E(32554): 2.5e-15
Smith-Waterman score: 1010; 43.3% identity (64.9% similar) in 490 aa overlap (1-474:1-441)
10 20 30 40 50 60
pF1KE2 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
::::... : .:. : :. :. : :. .: ::. . . ::.:::: :::::
CCDS16 MVQQAESLE-AESNLPREALDTEEG-EF-MACSPVALDES-------DPDWCKTASGHIK
10 20 30 40 50
70 80 90 100 110 120
pF1KE2 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
::::::::::.:::::::::::::::::::::::::::.::::.::::::::::::::::
CCDS16 RPMNAFMVWSKIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHM
60 70 80 90 100 110
130 140 150 160 170 180
pF1KE2 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
::::::::::::: : . :.. .::..: .: ...:::: .:::.::.... :.
CCDS16 ADYPDYKYRPRKKPK---MDPSAKPSASQSP----EKSAAGGGGGSAGGGAGGAKTSKGS
120 130 140 150 160
190 200 210 220 230
pF1KE2 GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHA--KLILAGGGGGGKAAAAAAASFA
. . : . . : . :. . .: ::.. .. .: ..:.:::: :. .. :
CCDS16 SKKCGKLKAPAAAGAKAGAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGG-AGKTVKCVFL
170 180 190 200 210 220
240 250 260 270 280 290
pF1KE2 AEQAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYL
:. :. .: . :. . . : ::.. ... : :
CCDS16 DEDDDD------DDDDDELQLQIKQEPD---EEDEEPPHQQLLQPPGQQ--PSQLLRRYN
230 240 250 260 270
300 310 320 330 340 350
pF1KE2 FGGLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADH
. . .::. ....:. . .::.: :: :.: :.: : .
CCDS16 VAKV--PASPT--LSSSAESPEGASLYDEVRAG--------------ATSGAGGGSRL-Y
280 290 300 310
360 370 380 390 400 410
pF1KE2 RGYASLRAASPAPSSAP--SHASSSASSHSSSSSSSGSSSSDDEFEDDL---LDLNPSSN
.. .. : : . : : ::: . : ::::::..::.:. : ::: :.:: :..
CCDS16 YSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADDLMFDLSLNFSQS
320 330 340 350 360 370
420 430 440 450 460
pF1KE2 FESMS---LGSFSSSSAL-----DRDLDFNFEPGS-GSHFEFPDYCTPEVSEMISGDWLE
.: : ::. .... : :.::: .: :: :::::::::::::.::::.:::::
CCDS16 AHSASEQQLGGGAAAGNLSLSLVDKDLD-SFSEGSLGSHFEFPDYCTPELSEMIAGDWLE
380 390 400 410 420 430
470
pF1KE2 SSISNLVFTY
...:.:::::
CCDS16 ANFSDLVFTY
440
>>CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 (315 aa)
initn: 862 init1: 580 opt: 647 Z-score: 354.4 bits: 74.3 E(32554): 2e-13
Smith-Waterman score: 679; 38.5% identity (50.2% similar) in 442 aa overlap (34-474:16-315)
10 20 30 40 50 60
pF1KE2 QTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIKRPM
: :: . : : .:.:::::::::::::
CCDS12 MVQQRGARAKRDGGPPPPGPGPAEEG-AREPGWCKTPSGHIKRPM
10 20 30 40
70 80 90 100 110 120
pF1KE2 NAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHMADY
:::::::: ::::::.: :::::::::::::.::.::.::.::::.::::::::::::::
CCDS12 NAFMVWSQHERRKIMDQWPDMHNAEISKRLGRRWQLLQDSEKIPFVREAERLRLKHMADY
50 60 70 80 90 100
130 140 150 160 170 180
pF1KE2 PDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGGGGG
:::::::::: :..: :...: :
CCDS12 PDYKYRPRKK--------SKGAPAKARPRPPG----------------------------
110 120
190 200 210 220 230 240
pF1KE2 ASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAEQAG
.::::. :: : .. : :: . ::: :: ::: .
CCDS12 GSGGGSRLKP------GPQLPG-RGGRRA--------AGGPLGGGAAAPEDDD-------
130 140 150 160
250 260 270 280 290 300
pF1KE2 AAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFGGLG
: . : ..: . .::..: :.
CCDS12 ---------EDDDEELLEVRL----------------VETPGREL-----WRMV------
170 180 190
310 320 330 340 350 360
pF1KE2 TSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADHRGYAS
:.: .. : :.. : :: : ..::
CCDS12 ----PAGRAARGQ---------AERAQG-----PSGEGAAAAA-----------------
200 210
370 380 390 400 410 420
pF1KE2 LRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLGSFS
::::.:: . .. . . .: .: : : :. .:
CCDS12 --AASPTPSEDEEPEEEEEEAAAAEEGEEETVASGEESLGFLSRLPPGPA----GL----
220 230 240 250 260
430 440 450 460 470
pF1KE2 SSSALDRDLDFNFEPGSG-SHFEFPDYCTPEVSEMISGDWLESSISNLVFTY
. :::::: : ..: :: :::::::::::::.:::.::: :::..:::::
CCDS12 DCSALDRDPD--LQPPSGTSHFEFPDYCTPEVTEMIAGDWRPSSIADLVFTY
270 280 290 300 310
474 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Mon Nov 7 20:29:47 2016 done: Mon Nov 7 20:29:47 2016
Total Scan time: 4.040 Total Display time: 0.000
Function used was FASTA [36.3.4 Apr, 2011]