FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB9552, 466 aa
1>>>pF1KB9552 466 - 466 aa - 466 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 8.7342+/-0.000787; mu= 4.2724+/- 0.048
mean_var=207.8376+/-41.944, 0's: 0 Z-trim(116.0): 58 B-trim: 241 in 2/52
Lambda= 0.088964
statistics sampled from 16520 (16578) to 16520 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.802), E-opt: 0.2 (0.509), width: 16
Scan time: 3.190
The best scores are: opt bits E(32554)
CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22 ( 466) 3261 430.6 1.6e-120
CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17 ( 509) 1350 185.4 1.2e-46
CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16 ( 446) 869 123.6 4.1e-28
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 ( 414) 436 68.0 2.1e-11
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 ( 315) 422 66.1 5.9e-11
>>CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22 (466 aa)
initn: 3261 init1: 3261 opt: 3261 Z-score: 2277.1 bits: 430.6 E(32554): 1.6e-120
Smith-Waterman score: 3261; 100.0% identity (100.0% similar) in 466 aa overlap (1-466:1-466)
10 20 30 40 50 60
pF1KB9 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLGPDGGGGGSGLRASPGPGELGKVKKEQQD
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLGPDGGGGGSGLRASPGPGELGKVKKEQQD
10 20 30 40 50 60
70 80 90 100 110 120
pF1KB9 GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARR
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARR
70 80 90 100 110 120
130 140 150 160 170 180
pF1KB9 KLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKN
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 KLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKN
130 140 150 160 170 180
190 200 210 220 230 240
pF1KB9 GKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSGQSHGPPT
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 GKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSGQSHGPPT
190 200 210 220 230 240
250 260 270 280 290 300
pF1KB9 PPTTPKTELQSGKADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELDQY
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 PPTTPKTELQSGKADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELDQY
250 260 270 280 290 300
310 320 330 340 350 360
pF1KB9 LPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISKPPGVALPTVSPPGVDAKAQVKTET
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 LPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISKPPGVALPTVSPPGVDAKAQVKTET
310 320 330 340 350 360
370 380 390 400 410 420
pF1KB9 AGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHSGQASG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 AGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHSGQASG
370 380 390 400 410 420
430 440 450 460
pF1KB9 LYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPTHWEQPVYTTLSRP
::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 LYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPTHWEQPVYTTLSRP
430 440 450 460
>>CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17 (509 aa)
initn: 1439 init1: 805 opt: 1350 Z-score: 951.0 bits: 185.4 E(32554): 1.2e-46
Smith-Waterman score: 1624; 54.0% identity (72.2% similar) in 493 aa overlap (18-456:13-499)
10 20 30 40 50
pF1KB9 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLGPDGGGG----GSGLRASPGPGELGKVKK
:. . :: . .:... :..:. ::: . . . :
CCDS11 MNLLDPFMKMTDEQEKGLSGAPSPTMSEDSAGSPCPSGSGSDTENTRPQENTFPK
10 20 30 40 50
60 70 80 90 100 110
pF1KB9 EQQD--GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVW
. : :...:::::::::::::::.:::::::::::::::.::.::::::::::::::
CCDS11 GEPDLKKESEEDKFPVCIREAVSQVLKGYDWTLVPMPVRVNGSSKNKPHVKRPMNAFMVW
60 70 80 90 100 110
120 130 140 150 160 170
pF1KB9 AQAARRKLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQ
::::::::::::::::::::::::::::::::::.::::.:::::::.::::::::::::
CCDS11 AQAARRKLADQYPHLHNAELSKTLGKLWRLLNESEKRPFVEEAERLRVQHKKDHPDYKYQ
120 130 140 150 160 170
180 190 200 210 220 230
pF1KB9 PRRRKNGKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSD-GNPEHPSG
:::::. : .:.::: :: . . .: .:. . : : . : ::. .: . ::
CCDS11 PRRRKSVKNGQAEAE----EATEQTHISPNAIFKALQADSPHSSSG--MSEVHSPGEHSG
180 190 200 210 220
240 250 260 270 280 290
pF1KB9 QSHGPPTPPTTPKTELQSGKADPKRDGRSMGEGGK-PHIDFGNVDIGEISHEVMSNMETF
::.:::::::::::..: :::: ::.:: . :::. : ::: .:::::.: .:.::.:::
CCDS11 QSQGPPTPPTTPKTDVQPGKADLKREGRPLPEGGRQPPIDFRDVDIGELSSDVISNIETF
230 240 250 260 270 280
300 310 320 330 340
pF1KB9 DVAELDQYLPPNGHPG----HVSSYSAAGYGLGSALAV-ASGHSAWISK----PPGVALP
:: :.::::::::::: : . ...::..:. :. ::. .:.:: :: :
CCDS11 DVNEFDQYLPPNGHPGVPATHGQVTYTGSYGISSTAATPASAGHVWMSKQQAPPPPPQQP
290 300 310 320 330 340
350 360 370
pF1KB9 TVSPPGVDAKAQVKTE-----TAGPQ---------------------------GPPHYTD
.::. .: : .. .: :: .: ::..
CCDS11 PQAPPAPQAPPQPQAAPPQQPAAPPQQPQAHTLTTLSSEPGQSQRTHIKTEQLSPSHYSE
350 360 370 380 390 400
380 390 400 410 420
pF1KB9 QP--STSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHS-GQASGLYSAFSYM
: : .::::. ..::::. ..: :.: :.::.::: :. ::.:. ::..::::.:.::
CCDS11 QQQHSPQQIAYSPFNLPHYSPSYPPITRSQYDYTDHQNSSSYYSHAAGQGTGLYSTFTYM
410 420 430 440 450 460
430 440 450 460
pF1KB9 GPSQRPLYTAISDPS--PSGPQSHSPTHWEQPVYTTLSRP
.:.:::.:: :.: : :: ::.::: :::
CCDS11 NPAQRPMYTPIADTSGVPSIPQTHSPQHWEQPVYTQLTRP
470 480 490 500
>>CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16 (446 aa)
initn: 1010 init1: 579 opt: 869 Z-score: 618.1 bits: 123.6 E(32554): 4.1e-28
Smith-Waterman score: 1281; 50.2% identity (68.1% similar) in 474 aa overlap (10-466:2-446)
10 20 30 40 50
pF1KB9 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLG--PDGGGGGSGLRA-SPGPGELG-KVKK
...: . :. : : :.: :.. :. . . : : : :. : :
CCDS10 MLDMSEARSQPP-CSPSGTASSMSHVEDSDSDAPPSPAGSEGLGRAGVAVGG
10 20 30 40 50
60 70 80 90 100 110
pF1KB9 EQQD-GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNG--ASKSKPHVKRPMNAFMV
. : .:: :..::.:::.::::::.::::.::::::: .: : :.:::::::::::::
CCDS10 ARGDPAEAADERFPACIRDAVSQVLKGYDWSLVPMPVRGGGGGALKAKPHVKRPMNAFMV
60 70 80 90 100 110
120 130 140 150 160 170
pF1KB9 WAQAARRKLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKY
::::::::::::::::::::::::::::::::.::.::::.:::::::.:::::::::::
CCDS10 WAQAARRKLADQYPHLHNAELSKTLGKLWRLLSESEKRPFVEEAERLRVQHKKDHPDYKY
120 130 140 150 160 170
180 190 200 210 220 230
pF1KB9 QPRRRKNGKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSG
::::::..:: :... .: :: : . : ::. . : :. :. : .:
CCDS10 QPRRRKSAKA--GHSDSDSG-AELGPHPGGGAVYKA------EAGLGDGHHHGD--H-TG
180 190 200 210
240 250 260 270 280 290
pF1KB9 QSHGPPTPPTTPKTELQSGKADP--KRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMET
:.:::::::::::::::.. : : : .:: ..:. .:::.::::.:.: :::..:..
CCDS10 QTHGPPTPPTTPKTELQQAGAKPELKLEGRRPVDSGRQNIDFSNVDISELSSEVMGTMDA
220 230 240 250 260 270
300 310 320 330 340
pF1KB9 FDVAELDQYLPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISK--PPGVALPTVSPPG
::: :.::::: .: :. : . :.: :.. .: : : . : :: . :
CCDS10 FDVHEFDQYLPLGG-PAP----PEPGQAYGGAYFHAGASPVWAHKSAPSASASPTETGP-
280 290 300 310 320 330
350 360 370 380 390 400
pF1KB9 VDAKAQVKTETAGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQF-----DYSD
. ..::: :. : :: ::: : : : : .:: :. : ::.:
CCDS10 --PRPHIKTEQ--PS-PGHYGDQPRGSP-DYGSCS--GQSSATPAAPAGPFAGSQGDYGD
340 350 360 370 380
410 420 430 440 450 460
pF1KB9 HQPSGPYYGHSGQASGLYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPT-HWEQPVYTTL
: :. : .. : : :::. . .: .:: . . . . . : .:::: ::.:::::::
CCDS10 LQASSYYGAYPGYAPGLYQYPCFHSP-RRPYASPLLN-GLALPPAHSPTSHWDQPVYTTL
390 400 410 420 430 440
pF1KB9 SRP
.::
CCDS10 TRP
>>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa)
initn: 451 init1: 410 opt: 436 Z-score: 318.2 bits: 68.0 E(32554): 2.1e-11
Smith-Waterman score: 466; 35.4% identity (55.4% similar) in 325 aa overlap (91-410:55-330)
70 80 90 100 110 120
pF1KB9 GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARR
:. . : .:.. ...::::::::::. :.
CCDS61 AGLGPCPWAESLSPIGDMKVKGEAPANSGAPAGAAGRAKGESRIRRPMNAFMVWAKDERK
30 40 50 60 70 80
130 140 150 160 170 180
pF1KB9 KLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKN
.::.: : :::::::: ::: :. :. ..::::.:::::::.:: .:::.:::.:::::.
CCDS61 RLAQQNPDLHNAELSKMLGKSWKALTLAEKRPFVEEAERLRVQHMQDHPNYKYRPRRRKQ
90 100 110 120 130 140
190 200 210 220 230
pF1KB9 GKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDG-NPEHP-SGQSHGP
: . . :: . : : :: : : : : :: . . : .: ::
CCDS61 VKRLK---RVEGGFLH--GLAEPQA----AALG---PEGGRVAMDGLGLQFPEQGFPAGP
150 160 170 180 190
240 250 260 270 280 290
pF1KB9 PTPPTTPKTELQSGKADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELD
: : :. ..:. :: .:.: : .: : . : :.. ::
CCDS61 PLLP--PH---MGGHY---RDCQSLGA---PPLD------GY-------PLPTPDTSPLD
200 210 220
300 310 320 330 340 350
pF1KB9 QYLPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISKPPGVALPTVSPPGVDAKAQVKT
: : .. :: . :. :... : .: : : ::. . ..
CCDS61 GVDPD---P----AFFAAPMP-GDCPAAGTYSYAQVSDYAG---PP-EPPAGPMHPRLGP
230 240 250 260 270
360 370 380 390 400 410
pF1KB9 ETAGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYS---DHQPSGPYYGHS
: :::. : ::. .. : ... : :.. .:: ... .:.: ::
CCDS61 EPAGPS-IPGLLAPPSALHVYYGAMGSPGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPSP
280 290 300 310 320 330
420 430 440 450 460
pF1KB9 GQASGLYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPTHWEQPVYTTLSRP
CCDS61 PPEALPCRDGTDPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPDSHGAI
340 350 360 370 380 390
>>CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 (315 aa)
initn: 463 init1: 414 opt: 422 Z-score: 310.2 bits: 66.1 E(32554): 5.9e-11
Smith-Waterman score: 446; 33.2% identity (57.2% similar) in 271 aa overlap (103-369:39-285)
80 90 100 110 120 130
pF1KB9 REAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARRKLADQYPHLHNA
:.::::::::::.: :::. ::.: .:::
CCDS12 AKRDGGPPPPGPGPAEEGAREPGWCKTPSGHIKRPMNAFMVWSQHERRKIMDQWPDMHNA
10 20 30 40 50 60
140 150 160 170 180 190
pF1KB9 ELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKNGKAAQGEAECPG
:.:: ::. :.::..:.: ::..::::::..: :.:::::.::....: :... . ::
CCDS12 EISKRLGRRWQLLQDSEKIPFVREAERLRLKHMADYPDYKYRPRKKSKGAPAKARPRPPG
70 80 90 100 110 120
200 210 220 230 240 250
pF1KB9 GEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSGQSHGPPTPPTTPKTELQSG
: .:: . .. . .: ::.:. . :.: .: . .: ::
CCDS12 G---SGGGSRLKP---GPQL----PGRGGRRAAGGPL--GGGAAAPEDDDEDDDEELLEV
130 140 150 160 170
260 270 280 290 300 310
pF1KB9 KADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELDQYLPPNGHPGHVSS
. . :: . . . : . :. . . : .: . : . . .
CCDS12 RL-VETPGRELWR----MVPAGRAARGQAERAQGPSGEGAAAAAAASPTPSEDEEPEEEE
180 190 200 210 220 230
320 330 340 350 360
pF1KB9 YSAAGYGLGSALAVASGHSA--WISK-PPGVALPTVSPPGVDAKAQVKT-ETAGPQGPPH
::. : .::::. . ..:. ::: : :.: .: . . :.: :
CCDS12 EEAAAAEEGEEETVASGEESLGFLSRLPPG-------PAGLDCSALDRDPDLQPPSGTSH
240 250 260 270 280
370 380 390 400 410 420
pF1KB9 YTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHSGQASGLYSAFSYM
.
CCDS12 FEFPDYCTPEVTEMIAGDWRPSSIADLVFTY
290 300 310
466 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Sat Nov 5 02:06:27 2016 done: Sat Nov 5 02:06:28 2016
Total Scan time: 3.190 Total Display time: 0.030
Function used was FASTA [36.3.4 Apr, 2011]
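
The "The best scores are:" table in the report above lists one hit per line: accession, description, library sequence length in parentheses, then the opt score, bit score and E() value. The following is a minimal sketch, not part of the FASTA package, of how those summary lines could be pulled out of a report like this one. The names `Hit`, `HIT_RE` and `parse_hit` are hypothetical, and the pattern assumes the default report layout shown here; other output formats would need a different pattern.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One line of the 'best scores' summary table (hypothetical container)."""
    accession: str
    description: str
    length: int      # library sequence length, from "( 466)"
    opt: int         # optimized score
    bits: float      # bit score
    evalue: float    # E() value against the searched library


# Assumed layout: <accession> <description> ( <len>) <opt> <bits> <E>
HIT_RE = re.compile(
    r"^(?P<acc>\S+)\s+(?P<desc>.*?)\s*\(\s*(?P<len>\d+)\)\s+"
    r"(?P<opt>\d+)\s+(?P<bits>\d+(?:\.\d+)?)\s+(?P<e>[0-9.eE+-]+)\s*$"
)


def parse_hit(line: str) -> Optional[Hit]:
    """Return a Hit for a summary-table line, or None for any other line."""
    m = HIT_RE.match(line.strip())
    if m is None:
        return None
    return Hit(
        accession=m.group("acc"),
        description=m.group("desc"),
        length=int(m.group("len")),
        opt=int(m.group("opt")),
        bits=float(m.group("bits")),
        evalue=float(m.group("e")),
    )


def parse_report(text: str) -> list:
    """Collect every summary-table hit found in a full report string."""
    hits = []
    for line in text.splitlines():
        hit = parse_hit(line)
        if hit is not None:
            hits.append(hit)
    return hits
```

Alignment headers such as `>>CCDS13964.1 ... (466 aa)` and the `initn:` score lines do not fit the pattern, so `parse_hit` returns `None` for them and only the summary rows are collected.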