FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB7755, 446 aa
1>>>pF1KB7755 446 - 446 aa - 446 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 8.4289+/-0.000812; mu= 6.2493+/- 0.049
mean_var=244.6231+/-50.353, 0's: 0 Z-trim(116.6): 58 B-trim: 173 in 1/54
Lambda= 0.082002
statistics sampled from 17209 (17268) to 17209 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.817), E-opt: 0.2 (0.53), width: 16
Scan time: 3.080
The best scores are:                                       opt  bits E(32554)
CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16         ( 446) 3155 385.8 4.8e-107
CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17          ( 509) 1178 151.9 1.3e-36
CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22         ( 466)  869 115.4 1.3e-25
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8          ( 414)  482  69.5 7.1e-12
>>CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16 (446 aa)
initn: 3155 init1: 3155 opt: 3155 Z-score: 2035.3 bits: 385.8 E(32554): 4.8e-107
Smith-Waterman score: 3155; 100.0% identity (100.0% similar) in 446 aa overlap (1-446:1-446)
               10        20        30        40        50        60
pF1KB7 MLDMSEARSQPPCSPSGTASSMSHVEDSDSDAPPSPAGSEGLGRAGVAVGGARGDPAEAA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 MLDMSEARSQPPCSPSGTASSMSHVEDSDSDAPPSPAGSEGLGRAGVAVGGARGDPAEAA
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB7 DERFPACIRDAVSQVLKGYDWSLVPMPVRGGGGGALKAKPHVKRPMNAFMVWAQAARRKL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 DERFPACIRDAVSQVLKGYDWSLVPMPVRGGGGGALKAKPHVKRPMNAFMVWAQAARRKL
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB7 ADQYPHLHNAELSKTLGKLWRLLSESEKRPFVEEAERLRVQHKKDHPDYKYQPRRRKSAK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 ADQYPHLHNAELSKTLGKLWRLLSESEKRPFVEEAERLRVQHKKDHPDYKYQPRRRKSAK
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB7 AGHSDSDSGAELGPHPGGGAVYKAEAGLGDGHHHGDHTGQTHGPPTPPTTPKTELQQAGA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 AGHSDSDSGAELGPHPGGGAVYKAEAGLGDGHHHGDHTGQTHGPPTPPTTPKTELQQAGA
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KB7 KPELKLEGRRPVDSGRQNIDFSNVDISELSSEVMGTMDAFDVHEFDQYLPLGGPAPPEPG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 KPELKLEGRRPVDSGRQNIDFSNVDISELSSEVMGTMDAFDVHEFDQYLPLGGPAPPEPG
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KB7 QAYGGAYFHAGASPVWAHKSAPSASASPTETGPPRPHIKTEQPSPGHYGDQPRGSPDYGS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 QAYGGAYFHAGASPVWAHKSAPSASASPTETGPPRPHIKTEQPSPGHYGDQPRGSPDYGS
              310       320       330       340       350       360

              370       380       390       400       410       420
pF1KB7 CSGQSSATPAAPAGPFAGSQGDYGDLQASSYYGAYPGYAPGLYQYPCFHSPRRPYASPLL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 CSGQSSATPAAPAGPFAGSQGDYGDLQASSYYGAYPGYAPGLYQYPCFHSPRRPYASPLL
              370       380       390       400       410       420

              430       440
pF1KB7 NGLALPPAHSPTSHWDQPVYTTLTRP
       ::::::::::::::::::::::::::
CCDS10 NGLALPPAHSPTSHWDQPVYTTLTRP
              430       440

>>CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17 (509 aa)
initn: 1081 init1: 586 opt: 1178 Z-score: 770.5 bits: 151.9 E(32554): 1.3e-36
Smith-Waterman score: 1243; 48.5% identity (67.0% similar) in 470 aa overlap (16-419:19-480)
10 20 30 40 50
pF1KB7 MLDMSEARSQPPCSPSGTASSMSHVEDSDSDAPPSPAGSEGLGRAGVAVGGARGDP-
:: : : . ::: .. :: .::. . .:.:
CCDS11 MNLLDPFMKMTDEQEKGLSG-APSPTMSEDSAGSPCPSGSGSDTENTRPQENTFPKGEPD
10 20 30 40 50
60 70 80 90 100 110
pF1KB7 --AEAADERFPACIRDAVSQVLKGYDWSLVPMPVRGGGGGALKAKPHVKRPMNAFMVWAQ
:. ...::.:::.:::::::::::.::::::: .:.. : ::::::::::::::::
CCDS11 LKKESEEDKFPVCIREAVSQVLKGYDWTLVPMPVRVNGSS--KNKPHVKRPMNAFMVWAQ
60 70 80 90 100 110
120 130 140 150 160 170
pF1KB7 AARRKLADQYPHLHNAELSKTLGKLWRLLSESEKRPFVEEAERLRVQHKKDHPDYKYQPR
:::::::::::::::::::::::::::::.::::::::::::::::::::::::::::::
CCDS11 AARRKLADQYPHLHNAELSKTLGKLWRLLNESEKRPFVEEAERLRVQHKKDHPDYKYQPR
120 130 140 150 160 170
180 190 200 210 220
pF1KB7 RRKSAKAGHSDSDSGAELGPHPGGGAVYKA--------EAGLGDGHHHGDHTGQTHGPPT
::::.: :..... ..: : . .:..:: .:... : :.:.::..::::
CCDS11 RRKSVKNGQAEAEEATEQ-THISPNAIFKALQADSPHSSSGMSEVHSPGEHSGQSQGPPT
180 190 200 210 220 230
230 240 250 260 270 280
pF1KB7 PPTTPKTELQQAGAKPELKLEGRRPVDSGRQN-IDFSNVDISELSSEVMGTMDAFDVHEF
:::::::..: . : .:: ::: ..::: ::: .:::.::::.:......:::.::
CCDS11 PPTTPKTDVQPG--KADLKREGRPLPEGGRQPPIDFRDVDIGELSSDVISNIETFDVNEF
240 250 260 270 280 290
290 300 310 320
pF1KB7 DQYLPLGG-PA-PPEPGQA-YGGAY-------FHAGASPVWAHKS-APSA------SASP
::::: .: :. : ::. : :.: :.:. :: :. :: .: :
CCDS11 DQYLPPNGHPGVPATHGQVTYTGSYGISSTAATPASAGHVWMSKQQAPPPPPQQPPQAPP
300 310 320 330 340 350
330 340 350
pF1KB7 TETGPPRP---------------------------------HIKTEQPSPGHYGDQPRGS
. .::.: :::::: ::.::..: . :
CCDS11 APQAPPQPQAAPPQQPAAPPQQPQAHTLTTLSSEPGQSQRTHIKTEQLSPSHYSEQQQHS
360 370 380 390 400 410
360 370 380 390 400 410
pF1KB7 PDYGSCS--GQSSATPAAPAGPFAGSQGDYGDLQ-ASSYYGAYPGYAPGLYQYPCFHSP-
:. . : . .:. : :.. :: :: : : .::::. : . :::. . .:
CCDS11 PQQIAYSPFNLPHYSPSYP--PITRSQYDYTDHQNSSSYYSHAAGQGTGLYSTFTYMNPA
420 430 440 450 460 470
420 430 440
pF1KB7 RRPYASPLLNGLALPPAHSPTSHWDQPVYTTLTRP
.::. .:.
CCDS11 QRPMYTPIADTSGVPSIPQTHSPQHWEQPVYTQLTRP
480 490 500
>>CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22 (466 aa)
initn: 1010 init1: 579 opt: 869 Z-score: 573.4 bits: 115.4 E(32554): 1.3e-25
Smith-Waterman score: 1281; 50.3% identity (68.4% similar) in 475 aa overlap (2-446:10-466)
10 20 30 40 50
pF1KB7 MLDMSEARSQPP-CSPSGTASSMSHVEDSDSDAPPSPAGSEGLGRAGVAVGG
...: . :. : : :.: :.. :. . . : : : :. : :
CCDS13 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLG--PDGGGGGSGLRA-SPGPGELG-KVKK
10 20 30 40 50
60 70 80 90 100 110
pF1KB7 ARGDPAEAADERFPACIRDAVSQVLKGYDWSLVPMPVRGGGGGALKAKPHVKRPMNAFMV
. : .:: :..::.:::.::::::.::::.::::::: .: : :.:::::::::::::
CCDS13 EQQD-GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNG--ASKSKPHVKRPMNAFMV
60 70 80 90 100 110
120 130 140 150 160 170
pF1KB7 WAQAARRKLADQYPHLHNAELSKTLGKLWRLLSESEKRPFVEEAERLRVQHKKDHPDYKY
::::::::::::::::::::::::::::::::.::.::::.:::::::.:::::::::::
CCDS13 WAQAARRKLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKY
120 130 140 150 160 170
180 190 200 210
pF1KB7 QPRRRKSAKA--GHSDSDSG-AELGPHPGGGAVYKA------EAGLGDGHHHGD--H-TG
::::::..:: :... .: :: : . : ::. . : :. :. : .:
CCDS13 QPRRRKNGKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSG
180 190 200 210 220 230
220 230 240 250 260 270
pF1KB7 QTHGPPTPPTTPKTELQQAGAKPELKLEGRRPVDSGRQNIDFSNVDISELSSEVMGTMDA
:.:::::::::::::::.. : : : .:: ..:. .:::.::::.:.: :::..:..
CCDS13 QSHGPPTPPTTPKTELQSGKADP--KRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMET
240 250 260 270 280 290
280 290 300 310 320 330
pF1KB7 FDVHEFDQYLPLGG-PAP----PEPGQAYGGAYFHAGASPVWAHKSAPSASASPTETGPP
::: :.::::: .: :. : . :.: :.. .: : : . : :: ..::
CCDS13 FDVAELDQYLPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISK--PPGVALPT-VSPP
300 310 320 330 340
340 350 360 370 380
pF1KB7 ----RPHIKTEQ--PS-PGHYGDQPRGSP-DYGSCS--GQSSATPAAPAGPFAGSQGDYG
. ..::: :. : :: ::: : : : : .:: :. : ::.
CCDS13 GVDAKAQVKTETAGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQF-----DYS
350 360 370 380 390 400
390 400 410 420 430 440
pF1KB7 DLQASSYYGAYPGYAPGLYQYPCFHSP-RRPYASPLLN-GLALPPAHSPTSHWDQPVYTT
: : :. : .. : : :::. . .: .:: . . . . . : .:::: ::.::::::
CCDS13 DHQPSGPYYGHSGQASGLYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPT-HWEQPVYTT
410 420 430 440 450 460
pF1KB7 LTRP
:.::
CCDS13 LSRP
>>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa)
initn: 496 init1: 424 opt: 482 Z-score: 326.6 bits: 69.5 E(32554): 7.1e-12
Smith-Waterman score: 509; 34.6% identity (54.4% similar) in 364 aa overlap (55-373:5-352)
30 40 50 60 70 80
pF1KB7 VEDSDSDAPPSPAGSEGLGRAGVAVGGARGDPAEAADERFPACIRDAVSQVLKGYD---W
: . :.:.. . ..:. :. : :
CCDS61 MSSPDAGYASDDQ--SQTQSALPAVMAGLGPCPW
10 20 30
90 100 110 120
pF1KB7 --SLVP---MPVRG----------GGGGALKAKPHVKRPMNAFMVWAQAARRKLADQYPH
:: : : :.: :..: :.. ...::::::::::. :..::.: :
CCDS61 AESLSPIGDMKVKGEAPANSGAPAGAAGRAKGESRIRRPMNAFMVWAKDERKRLAQQNPD
40 50 60 70 80 90
130 140 150 160 170 180
pF1KB7 LHNAELSKTLGKLWRLLSESEKRPFVEEAERLRVQHKKDHPDYKYQPRRRKSAK------
:::::::: ::: :. :. .:::::::::::::::: .:::.:::.:::::..:
CCDS61 LHNAELSKMLGKSWKALTLAEKRPFVEEAERLRVQHMQDHPNYKYRPRRRKQVKRLKRVE
100 110 120 130 140 150
190 200 210 220 230
pF1KB7 AG--HSDSD-SGAELGPHPGGGAVYKAEAGLGDGHHHGDHTGQTHGPPT-PPTTPK--TE
.: :. .. ..: :::. :: : : ::: . . : ::: :: .
CCDS61 GGFLHGLAEPQAAALGPE--GGRV--AMDGLG---LQFPEQGFPAGPPLLPPHMGGHYRD
160 170 180 190 200
240 250 260 270 280 290
pF1KB7 LQQAGAKPELKLEGRRPVDSGRQN-IDFSNVDISELSSEVMGTMDAFDVHEFDQYLPLGG
:. :: : :.: :. . . .: . : . ... . : : .. . : .:
CCDS61 CQSLGAPP---LDGY-PLPTPDTSPLDGVDPDPAFFAAPMPGDCPAAGTYSYAQVSDYAG
210 220 230 240 250 260
300 310 320 330
pF1KB7 PAPPEPGQAY-------GGAYFHAGASPVWAHKSAPSASASPTETG-------PPRPHIK
: : : . .: . . .: : . .: .:: : : . : .
CCDS61 PPEPPAGPMHPRLGPEPAGPSIPGLLAPPSALHVYYGAMGSPGAGGGRGFQMQPQHQHQH
270 280 290 300 310 320
340 350 360 370 380 390
pF1KB7 TEQPSPGHYGDQPRGSPDYGSCSGQSSATPAAPAGPFAGSQGDYGDLQASSYYGAYPGYA
.: : : :: :. : .... :. ::
CCDS61 QHQHHPPGPG-QPSPPPEALPC--RDGTDPSQPAELLGEVDRTEFEQYLHFVCKPEMGLP
330 340 350 360 370
400 410 420 430 440
pF1KB7 PGLYQYPCFHSPRRPYASPLLNGLALPPAHSPTSHWDQPVYTTLTRP
CCDS61 YQGHDSGVNLPDSHGAISSVVSDASSAVYYCNYPDV
380 390 400 410
446 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Fri Nov 4 09:23:25 2016 done: Fri Nov 4 09:23:25 2016
Total Scan time: 3.080 Total Display time: 0.000
Function used was FASTA [36.3.4 Apr, 2011]