FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB9648, 391 aa
1>>>pF1KB9648 391 - 391 aa - 391 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 8.9186+/-0.00101; mu= 4.2540+/- 0.061
mean_var=298.1848+/-61.698, 0's: 0 Z-trim(114.8): 68 B-trim: 0 in 0/50
Lambda= 0.074273
statistics sampled from 15261 (15328) to 15261 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.765), E-opt: 0.2 (0.471), width: 16
Scan time: 3.140
The best scores are:                               opt  bits E(32554)
CCDS9523.1 SOX1 gene_id:6656|Hs108|chr13   ( 391) 2681 300.3    2e-81
CCDS14669.1 SOX3 gene_id:6658|Hs108|chrX   ( 446)  833 102.3  9.1e-22
CCDS3239.1 SOX2 gene_id:6657|Hs108|chr3    ( 317)  780  96.5  3.7e-20
CCDS9473.1 SOX21 gene_id:11166|Hs108|chr13 ( 276)  611  78.3  9.6e-15
CCDS3094.1 SOX14 gene_id:8403|Hs108|chr3   ( 240)  602  77.3  1.7e-14
CCDS32549.1 SOX15 gene_id:6665|Hs108|chr17 ( 233)  485  64.7  9.9e-11
>>CCDS9523.1 SOX1 gene_id:6656|Hs108|chr13 (391 aa)
initn: 2681 init1: 2681 opt: 2681 Z-score: 1575.2 bits: 300.3 E(32554): 2e-81
Smith-Waterman score: 2681; 100.0% identity (100.0% similar) in 391 aa overlap (1-391:1-391)
               10        20        30        40        50        60
pF1KB9 MYSMMMETDLHSPGGAQAPTNLSGPAGAGGGGGGGGGGGGGGGAKANQDRVKRPMNAFMV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS95 MYSMMMETDLHSPGGAQAPTNLSGPAGAGGGGGGGGGGGGGGGAKANQDRVKRPMNAFMV
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB9 WSRGQRRKMAQENPKMHNSEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHPDYKY
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS95 WSRGQRRKMAQENPKMHNSEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHPDYKY
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB9 RPRRKTKTLLKKDKYSLAGGLLAAGAGGGGAAVAMGVGVGVGAAAVGQRLESPGGAAGGG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS95 RPRRKTKTLLKKDKYSLAGGLLAAGAGGGGAAVAMGVGVGVGAAAVGQRLESPGGAAGGG
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB9 YAHVNGWANGAYPGSVAAAAAAAAMMQEAQLAYGQHPGAGGAHPHAHPAHPHPHHPHAHP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS95 YAHVNGWANGAYPGSVAAAAAAAAMMQEAQLAYGQHPGAGGAHPHAHPAHPHPHHPHAHP
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KB9 HNPQPMHRYDMGALQYSPISNSQGYMSASPSGYGGLPYGAAAAAAAAAGGAHQNSAVAAA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS95 HNPQPMHRYDMGALQYSPISNSQGYMSASPSGYGGLPYGAAAAAAAAAGGAHQNSAVAAA
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KB9 AAAAAASSGALGALGSLVKSEPSGSPPAPAHSRAPCPGDLREMISMYLPAGEGGDPAAAA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS95 AAAAAASSGALGALGSLVKSEPSGSPPAPAHSRAPCPGDLREMISMYLPAGEGGDPAAAA
              310       320       330       340       350       360

              370       380       390
pF1KB9 AAAAQSRLHSLPQHYQGAGAGVNGTVPLTHI
       :::::::::::::::::::::::::::::::
CCDS95 AAAAQSRLHSLPQHYQGAGAGVNGTVPLTHI
              370       380       390
>>CCDS14669.1 SOX3 gene_id:6658|Hs108|chrX (446 aa)
initn: 1125 init1: 724 opt: 833 Z-score: 504.3 bits: 102.3 E(32554): 9.1e-22
Smith-Waterman score: 1326; 58.9% identity (77.1% similar) in 389 aa overlap (12-391:102-446)
10 20 30 40
pF1KB9 MYSMMMETDLHSPGGA-QAPTNLSGPAGAGGGGGGGGGGGG
.:::: .. .: .: :..:::..::..:::
CCDS14 PAPAMYSLLETELKNPVGTPTQAAGTGGPAAPGGAGKSSANAAGGANSGGGSSGGASGGG
80 90 100 110 120 130
50 60 70 80 90 100
pF1KB9 GGGAKANQDRVKRPMNAFMVWSRGQRRKMAQENPKMHNSEISKRLGAEWKVMSEAEKRPF
:: ..::::::::::::::::::::::: ::::::::::::::::.::....::::::
CCDS14 GG---TDQDRVKRPMNAFMVWSRGQRRKMALENPKMHNSEISKRLGADWKLLTDAEKRPF
140 150 160 170 180
110 120 130 140 150 160
pF1KB9 IDEAKRLRALHMKEHPDYKYRPRRKTKTLLKKDKYSLAGGLLAAGAGGGGAAVAMGVGVG
:::::::::.::::.:::::::::::::::::::::: .::: ::....::.: .....
CCDS14 IDEAKRLRAVHMKEYPDYKYRPRRKTKTLLKKDKYSLPSGLLPPGAAAAAAAAAAAAAAA
190 200 210 220 230 240
170 180 190 200 210 220
pF1KB9 VGAAAVGQRLESPGGAAGGGYAHVNGWANGAYPGSVAAAAAAAAMMQEAQLAYGQHPGAG
. ..:::::.. :.:::::::::: ...:: ::.:.: :. .
CCDS14 SSPVGVGQRLDT--------YTHVNGWANGAY-----------SLVQE-QLGYAQPPSMS
250 260 270 280
230 240 250 260 270
pF1KB9 GAHPHAHPAHPHPHHPHAHPHNPQPMHRYDMGALQYSPI--SNSQGYMS-----ASPSGY
. : : : : : :::::::..:::::. ..:.::. :. :::
CCDS14 S---------PPP--PPALP----PMHRYDMAGLQYSPMMPPGAQSYMNVAAAAAAASGY
290 300 310 320 330
280 290 300 310 320 330
pF1KB9 GGLPYGAAAAAAAAAGGAHQNSAVAAAAAAAAASSGALGALGSLVKSEPSGSPPAPA-HS
::. .:.:::::: : :. :.:::::::::. .:: .::.::::::. ::: : ::
CCDS14 GGMAPSATAAAAAAYG---QQPATAAAAAAAAAAM-SLGPMGSVVKSEPSSPPPAIASHS
340 350 360 370 380
340 350 360 370 380 390
pF1KB9 RAPCPGDLREMISMYLPAGEGGDPAAAAAAAAQSRLHSLPQHYQGAGAGVNGTVPLTHI
. : ::::.::::::: ::: : ::. .:::.. :::::::..::::::::::
CCDS14 QRACLGDLRDMISMYLPP--GGDAADAASPLPGGRLHGVHQHYQGAGTAVNGTVPLTHI
390 400 410 420 430 440
>>CCDS3239.1 SOX2 gene_id:6657|Hs108|chr3 (317 aa)
initn: 1037 init1: 728 opt: 780 Z-score: 475.4 bits: 96.5 E(32554): 3.7e-20
Smith-Waterman score: 1167; 52.9% identity (69.4% similar) in 399 aa overlap (1-391:1-317)
10 20 30 40 50 60
pF1KB9 MYSMMMETDLHSPGGAQAPTNLSGPAGAGGGGGGGGGGGGGGGAKANQDRVKRPMNAFMV
::.:: ::.:. :: :. .:::::.. ....::. : . ::::::::::::
CCDS32 MYNMM-ETELKPPGPQQT---------SGGGGGNSTAAAAGGNQKNSPDRVKRPMNAFMV
10 20 30 40 50
70 80 90 100 110 120
pF1KB9 WSRGQRRKMAQENPKMHNSEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHPDYKY
::::::::::::::::::::::::::::::..::.:::::::::::::::::::::::::
CCDS32 WSRGQRRKMAQENPKMHNSEISKRLGAEWKLLSETEKRPFIDEAKRLRALHMKEHPDYKY
60 70 80 90 100 110
130 140 150 160 170
pF1KB9 RPRRKTKTLLKKDKYSLAGGLLAAGAGGGGAAVAMGVGVGVG-AAAVGQRLESPGGAAGG
:::::::::.:::::.: ::::: : : ..: :::::.: .:.:.::..:
CCDS32 RPRRKTKTLMKKDKYTLPGGLLAPG----GNSMASGVGVGAGLGAGVNQRMDS-------
120 130 140 150
180 190 200 210 220 230
pF1KB9 GYAHVNGWANGAYPGSVAAAAAAAAMMQEAQLAYGQHPGAGGAHPHAHPAHPHPHHPHAH
:::.:::.::.: .:::. ::.: :::: .:: :
CCDS32 -YAHMNGWSNGSY-----------SMMQD-QLGYPQHPGL-----NAHGAAQM-------
160 170 180 190
240 250 260 270 280 290
pF1KB9 PHNPQPMHRYDMGALQYSPISNSQGYMSASPSGYGGLPYGAAAAAAAAAGGAHQNSAVAA
:::::::..::::. ...:: ::..::. :. . . .. : :
CCDS32 ----QPMHRYDVSALQYNSMTSSQTYMNGSPT------YSMSYSQQGTPGMA--------
200 210 220 230
300 310 320 330 340 350
pF1KB9 AAAAAAASSGALGALGSLVKSEPSGSPP---APAHSRAPCP-GDLREMISMYLPAGEGGD
::..::.:::: :.::: . .:::::: ::::.:::::::..: .
CCDS32 -----------LGSMGSVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMISMYLPGAEVPE
240 250 260 270 280
360 370 380 390
pF1KB9 PAAAAAAAAQSRLHSLPQHYQGA---GAGVNGTVPLTHI
::: :::: . ::::.. :...:::.::.:.
CCDS32 PAAP------SRLH-MSQHYQSGPVPGTAINGTLPLSHM
290 300 310
>>CCDS9473.1 SOX21 gene_id:11166|Hs108|chr13 (276 aa)
initn: 742 init1: 555 opt: 611 Z-score: 378.2 bits: 78.3 E(32554): 9.6e-15
Smith-Waterman score: 706; 46.4% identity (64.1% similar) in 323 aa overlap (49-364:6-275)
20 30 40 50 60 70
pF1KB9 PTNLSGPAGAGGGGGGGGGGGGGGGAKANQDRVKRPMNAFMVWSRGQRRKMAQENPKMHN
:.:::::::::::::.::::::::::::::
CCDS94 MSKPVDHVKRPMNAFMVWSRAQRRKMAQENPKMHN
10 20 30
80 90 100 110 120 130
pF1KB9 SEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTLLKKDKYSLA
::::::::::::...:.::::::::::::::.::::::::::::::: ::::::::...
CCDS94 SEISKRLGAEWKLLTESEKRPFIDEAKRLRAMHMKEHPDYKYRPRRKPKTLLKKDKFAFP
40 50 60 70 80 90
140 150 160 170 180 190
pF1KB9 GGLLAAGAGGGGAAVAMGVGVGVGAAAVGQRLESPGGAAGGGYAHVNGWANGAYPGSVAA
. : :: . : .. .:.: : .:::: . . :: : ..::
CCDS94 ---VPYGLGGVADAEHPALKAGAGLHA----------GAGGGLVPESLLAN---PEKAAA
100 110 120 130
200 210 220 230 240 250
pF1KB9 AAAAAAMMQEAQLAYGQHPGAGGAHPHAHPAHPHPHHPHAHPHNPQPMHRYDMGALQYSP
:::::: :.. . : .:..: : : .:. :.:. ...
CCDS94 AAAAAA----ARVFFPQSAAAAAAAAAAAAAG-------------SPYSLLDLGS-KMAE
140 150 160 170 180
260 270 280 290 300 310
pF1KB9 ISNSQGYMSASPSGYGGLPYGAAAAAAAAAGGAHQNSAVAAAAAAAAASSGALGALGSLV
::.:.. ::::... . .:..:: ...:.::::::::: :. .
CCDS94 ISSSSS----------GLPYASSLGYPTAGAGAFHGAAAAAAAAAAAA--------GGHT
190 200 210 220
320 330 340 350 360 370
pF1KB9 KSEPSGSPPA---PAHSRA-PCPGDLREMISMYLPAGEGG---DPAAAAAAAAQSRLHSL
.:.:: . :. : . : : :: . . :: : : :: :: :::
CCDS94 HSHPSPGNPGYMIPCNCSAWPSPGLQPPLAYILLP-GMGKPQLDPYPAAYAAAL
230 240 250 260 270
380 390
pF1KB9 PQHYQGAGAGVNGTVPLTHI
>>CCDS3094.1 SOX14 gene_id:8403|Hs108|chr3 (240 aa)
initn: 590 init1: 563 opt: 602 Z-score: 373.7 bits: 77.3 E(32554): 1.7e-14
Smith-Waterman score: 602; 46.5% identity (64.7% similar) in 241 aa overlap (49-285:6-239)
20 30 40 50 60 70
pF1KB9 PTNLSGPAGAGGGGGGGGGGGGGGGAKANQDRVKRPMNAFMVWSRGQRRKMAQENPKMHN
:..:::::::::::::::::::::::::::
CCDS30 MSKPSDHIKRPMNAFMVWSRGQRRKMAQENPKMHN
10 20 30
80 90 100 110 120 130
pF1KB9 SEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTLLKKDKYSLA
::::::::::::..:::::::.::::::::: ::::::::::::::: :.:::::.: .
CCDS30 SEISKRLGAEWKLLSEAEKRPYIDEAKRLRAQHMKEHPDYKYRPRRKPKNLLKKDRYVFP
40 50 60 70 80 90
140 150 160 170 180 190
pF1KB9 GGLLAAGAGGGGAAVAMGVGVGVGAAAVGQRLESPGGAAGGGYAHVNGWANGAYP--GSV
:. .:.. .:.. :. .: : : ..: . ....: : :
CCDS30 LPYLGDTDPLKAAGLPVGASDGLLSAPEKARAFLPPASAPYSLLDPAQFSSSAIQKMGEV
100 110 120 130 140 150
200 210 220 230 240 250
pF1KB9 AAAAAAAAMMQEAQLAYGQHPGAGGAHPHAHPAHPHPH-HPHAHPHNPQPMHRYDMGALQ
. :..:. . :.: . :: :. . : : : : : :: . . : .
CCDS30 PHTLATGALPYASTLGY--QNGAFGSL-----SCPSQHTHTHPSPTNPGYVVPCNCTAWS
160 170 180 190 200
260 270 280 290 300 310
pF1KB9 YSPISNSQGYMSASPSGYGGL-PYGAAAAAAAAAGGAHQNSAVAAAAAAAAASSGALGAL
: .. .:. :. ::..: :.:
CCDS30 ASTLQPPVAYILFPGMTKTGIDPYSSAHATAM
210 220 230 240
320 330 340 350 360 370
pF1KB9 GSLVKSEPSGSPPAPAHSRAPCPGDLREMISMYLPAGEGGDPAAAAAAAAQSRLHSLPQH
>>CCDS32549.1 SOX15 gene_id:6665|Hs108|chr17 (233 aa)
initn: 476 init1: 446 opt: 485 Z-score: 306.1 bits: 64.7 E(32554): 9.9e-11
Smith-Waterman score: 495; 39.8% identity (59.8% similar) in 246 aa overlap (10-246:14-228)
10 20 30 40 50
pF1KB9 MYSMMMETDLHSPGGAQAPTNLSGPAGAGGGGGGGGGGGGGGGAKANQDRVKRPMN
:. :... : .. ::: :.:. .. : ..::::::
CCDS32 MALPGSSQDQAWSLEPPAATAAASSSSGPQEREGAGSPAAPG------TLPLEKVKRPMN
10 20 30 40 50
60 70 80 90 100 110
pF1KB9 AFMVWSRGQRRKMAQENPKMHNSEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHP
:::::: .:::.:::.:::::::::::::::.::...: :::::..::::::: :....:
CCDS32 AFMVWSSAQRRQMAQQNPKMHNSEISKRLGAQWKLLDEDEKRPFVEEAKRLRARHLRDYP
60 70 80 90 100 110
120 130 140 150 160 170
pF1KB9 DYKYRPRRKTKTLLKKDKYSLAGGLLAAGAGGGGAAVAMGVGVGVGAAAVGQRLESPGGA
:::::::::.:. .::: . : : : : : : .:: :
CCDS32 DYKYRPRRKAKS---------------SGAGPSR------CGQGRGNLASGGPLWGPGYA
120 130 140 150
180 190 200 210 220
pF1KB9 A-----GGGYAHVNGWANGAYPGSVAAAAAAAAMMQEAQLAYGQHPGAGGAHP-HAH---
. : :: . ..... ::: ... . .: .. : : ..:
CCDS32 TTQPSRGFGY-RPPSYSTAYLPGSYGSSHCKLEAPSPCSLPQSDPRLQGELLPTYTHYLP
160 170 180 190 200 210
230 240 250 260 270 280
pF1KB9 PAHPHPHHPHAHPHNPQPMHRYDMGALQYSPISNSQGYMSASPSGYGGLPYGAAAAAAAA
:. : :..: : ::
CCDS32 PGSPTPYNP---PLAGAPMPLTHL
220 230
391 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Tue Nov 8 02:06:50 2016 done: Tue Nov 8 02:06:51 2016
Total Scan time: 3.140 Total Display time: 0.010
Function used was FASTA [36.3.4 Apr, 2011]