FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB8988, 388 aa
1>>>pF1KB8988 388 - 388 aa - 388 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 9.1071+/-0.000803; mu= 1.5643+/- 0.048
mean_var=247.9370+/-50.117, 0's: 0 Z-trim(117.0): 53 B-trim: 4 in 1/51
Lambda= 0.081452
statistics sampled from 17619 (17672) to 17619 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.838), E-opt: 0.2 (0.543), width: 16
Scan time: 3.400
The best scores are:                                 opt  bits E(32554)
CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8     ( 388) 2730 333.2  2.5e-91
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8    ( 414)  847 111.9  1.1e-24
CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20  ( 384)  559  78.0  1.6e-14
CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17    ( 509)  461  66.6  5.7e-11
>>CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8 (388 aa)
initn: 2730 init1: 2730 opt: 2730 Z-score: 1753.1 bits: 333.2 E(32554): 2.5e-91
Smith-Waterman score: 2730; 100.0% identity (100.0% similar) in 388 aa overlap (1-388:1-388)
               10        20        30        40        50        60
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGDKGSESRIRRPMNAFMVWAKDER
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGDKGSESRIRRPMNAFMVWAKDER
               10        20        30        40        50        60
               70        80        90       100       110       120
pF1KB8 KRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRKK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 KRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRKK
               70        80        90       100       110       120
              130       140       150       160       170       180
pF1KB8 QAKRLCKRVDPGFLLSSLSRDQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 QAKRLCKRVDPGFLLSSLSRDQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHE
              130       140       150       160       170       180
              190       200       210       220       230       240
pF1KB8 GPAGGGGGGTPSSVDTYPYGLPTPPEMSPLDVLEPEQTFFSSPCQEEHGHPRRIPHLPGH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 GPAGGGGGGTPSSVDTYPYGLPTPPEMSPLDVLEPEQTFFSSPCQEEHGHPRRIPHLPGH
              190       200       210       220       230       240
              250       260       270       280       290       300
pF1KB8 PYSPEYAPSPLHCSHPLGSLALGQSPGVSMMSPVPGCPPSPAYYSPATYHPLHSNLQAHL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 PYSPEYAPSPLHCSHPLGSLALGQSPGVSMMSPVPGCPPSPAYYSPATYHPLHSNLQAHL
              250       260       270       280       290       300
              310       320       330       340       350       360
pF1KB8 GQLSPPPEHPGFDALDQLSQVELLGDMDRNEFDQYLNTPGHPDSATGAMALSGHVPVSQV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 GQLSPPPEHPGFDALDQLSQVELLGDMDRNEFDQYLNTPGHPDSATGAMALSGHVPVSQV
              310       320       330       340       350       360
              370       380
pF1KB8 TPTGPTETSLISVLADATATYYNSYSVS
       ::::::::::::::::::::::::::::
CCDS59 TPTGPTETSLISVLADATATYYNSYSVS
              370       380
>>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa)
initn: 775 init1: 525 opt: 847 Z-score: 556.8 bits: 111.9 E(32554): 1.1e-24
Smith-Waterman score: 847; 43.3% identity (61.9% similar) in 404 aa overlap (5-385:27-411)
10 20 30
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGD
:: :: :.: : : ... :..: : :
CCDS61 MSSPDAGYASDDQSQTQSALPAVMAGLGPCPWAESLS-PIGDMKVK-GEAPANSGAPAGA
10 20 30 40 50
40 50 60 70 80 90
pF1KB8 KG---SESRIRRPMNAFMVWAKDERKRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYV
: .::::::::::::::::::::::: :::::::::::::::::::::::..:::.:
CCDS61 AGRAKGESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAELSKMLGKSWKALTLAEKRPFV
60 70 80 90 100 110
100 110 120 130 140 150
pF1KB8 DEAERLRLQHMQDYPNYKYRPRRKKQAKRLCKRVDPGFLLSSLSRDQNAL--PEKRSGSR
.::::::.:::::.:::::::::.::.::: :::. ::: .:.. : : :: .
CCDS61 EEAERLRVQHMQDHPNYKYRPRRRKQVKRL-KRVEGGFL-HGLAEPQAAALGPEGGRVAM
120 130 140 150 160 170
160 170 180 190 200 210
pF1KB8 GALGEKEDRGEYSPGTAL--PSLRGCYHEGPAGGGGGGTPSSVDTYPYGLPTPPEMSPLD
.:: . . . : : : . : :.. . :.: .: :: :::: . ::::
CCDS61 DGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSL----GAPP-LDGYP--LPTP-DTSPLD
180 190 200 210 220
220 230 240 250 260
pF1KB8 VLEPEQTFFSSP----CQEEHGHPRRI-------PHLPGHPYSPEYAPSPLHCSHPLGSL
..:. .::..: : . :. :. :. :. .: : : : : :
CCDS61 GVDPDPAFFAAPMPGDCPAAGTYSYAQVSDYAGPPEPPAGPMHPRLGPEPAGPSIP-GLL
230 240 250 260 270 280
270 280 290 300 310
pF1KB8 ALGQSPGV---SMMSPVPGCPPSPAYYSPATYHPLHSNLQAHLGQLSPPPEH-PGFDALD
: .. : .: :: : . . .. :.. :: ::::: : :. :
CCDS61 APPSALHVYYGAMGSPGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPSPPPEALPCRDGTD
290 300 310 320 330 340
320 330 340 350 360 370
pF1KB8 QLSQVELLGDMDRNEFDQYLNTPGHPDSATGAMALSGHVPVSQVTPTGPTETSLISVLAD
. .::::..::.::.:::. .:. . . .:: : :. .. .. ::..:
CCDS61 PSQPAELLGEVDRTEFEQYLHFVCKPEMG---LPYQGHD--SGVN-LPDSHGAISSVVSD
350 360 370 380 390 400
380
pF1KB8 AT-ATYYNSYSVS
:. :.:: .:
CCDS61 ASSAVYYCNYPDV
410
>>CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20 (384 aa)
initn: 745 init1: 480 opt: 559 Z-score: 374.4 bits: 78.0 E(32554): 1.6e-14
Smith-Waterman score: 782; 41.6% identity (58.2% similar) in 409 aa overlap (2-388:30-383)
10 20 30
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQ-SPPA
:. : : .: :: : . : :::
CCDS13 MQRSPPGYGAQDDPPARRDCAWAPGHGAAADTRGLAAGPAALAAPAAPASPPSPQRSPPR
10 20 30 40 50 60
40 50 60 70 80
pF1KB8 VPRP------PGDKG-----SESRIRRPMNAFMVWAKDERKRLAVQNPDLHNAELSKMLG
:.: :. .: .::::::::::::::::::::::: :::::::: ::::::
CCDS13 SPEPGRYGLSPAGRGERQAADESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAVLSKMLG
70 80 90 100 110 120
90 100 110 120 130 140
pF1KB8 KSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRKKQAKRLCKRVDPGFLLSSLSR
:.:: :. ..:::.:.::::::.::..:.:::::::::::::.. .:..::.:: .:.
CCDS13 KAWKELNAAEKRPFVEEAERLRVQHLRDHPNYKYRPRRKKQARK-ARRLEPGLLLPGLAP
130 140 150 160 170
150 160 170 180 190 200
pF1KB8 DQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHEGPAGGGGGGTPSSVDTYPYG
: :: .. : : :. ..: : : ... :
CCDS13 PQPP-PEPFPAASG------------------SARA-FRELP--------PLGAEFDGLG
180 190 200 210
210 220 230 240 250
pF1KB8 LPTPPEMSPLDVLEP-EQTFFSSPCQEEHGHPRRIPHLPGHPYSPEYAPSPLHCSHPLGS
:::: : :::: ::: : .:: : : : :. :::. : : :
CCDS13 LPTP-ERSPLDGLEPGEAAFFPPPAAPEDCALR--------PFRAPYAPTELS-RDPGGC
220 230 240 250 260
260 270 280 290 300 310
pF1KB8 LALGQSPGVSMMSPVPGCPPSPAYY----SPATYHPLHSNLQAHLGQLSPPPEHPGFDAL
: . .. . :. : . :: .:. : : : :::::: : ...
CCDS13 Y--GAPLAEALRTAPPAAPLAGLYYGTLGTPGPY-P---------GPLSPPPEAPPLESA
270 280 290 300
320 330 340 350 360 370
pF1KB8 DQLSQV-ELLGDMDRNEFDQYLN-TPGHPDSATGAMALSGHVPVSQVTPTG---PTETSL
. :. . .: .:.: .::::::: . .:: : : : :: .... : . : :.::
CCDS13 EPLGPAADLWADVDLTEFDQYLNCSRTRPD-APG---LPYHVALAKLGPRAMSCPEESSL
310 320 330 340 350 360
380
pF1KB8 ISVLADATATYYNSYSVS
::.:.::... : : .:
CCDS13 ISALSDASSAVYYSACISG
370 380
>>CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17 (509 aa)
initn: 479 init1: 383 opt: 461 Z-score: 310.5 bits: 66.6 E(32554): 5.7e-11
Smith-Waterman score: 461; 32.3% identity (54.7% similar) in 322 aa overlap (30-341:90-406)
10 20 30 40 50
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGDKGSESRIRRPMNAFMVWAKDE
: : :.. .. ...::::::::::.
CCDS11 LKKESEEDKFPVCIREAVSQVLKGYDWTLVPMPVRVNGSSKNKPHVKRPMNAFMVWAQAA
60 70 80 90 100 110
60 70 80 90 100 110
pF1KB8 RKRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRK
:..:: : : :::::::: ::: :. :. :.:::.:.::::::.:: .:.:.:::.:::.
CCDS11 RRKLADQYPHLHNAELSKTLGKLWRLLNESEKRPFVEEAERLRVQHKKDHPDYKYQPRRR
120 130 140 150 160 170
120 130 140 150 160 170
pF1KB8 KQAKRLCKRVDPGFLLSSLSRDQ--NALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGC
:..: ... . . .: . .:: : ....: .. ::.: . :
CCDS11 KSVKNGQAEAEEATEQTHISPNAIFKALQADSPHSSSGMSEVHSPGEHSGQSQGPPTPPT
180 190 200 210 220 230
180 190 200 210 220 230
pF1KB8 YHEGPAGGGGGGTPSSVDTYPYGLPTPP-EMSPLDVLEPEQTFFSSPCQEEHGHPRRIPH
. . : . : : :: .. .:. : . .:. : . .
CCDS11 TPKTDVQPGKADLKREGRPLPEGGRQPPIDFRDVDIGELSSDVISN--IETFDVNEFDQY
240 250 260 270 280 290
240 250 260 270 280
pF1KB8 LP--GHPYSPE-YAPSPLHCSHPLGSLA-LGQSPGVSMMSP--VPGCPPS-PAYYSPATY
:: ::: : .. :. ..: : : : :: .: ::. : ::
CCDS11 LPPNGHPGVPATHGQVTYTGSYGISSTAATPASAGHVWMSKQQAPPPPPQQPPQAPPAPQ
300 310 320 330 340 350
290 300 310 320 330 340
pF1KB8 HPLHSNLQAHLGQLSPPPEHPGFDALDQLSQVELLGDMDRNEFDQYLNTPGHPDSATGAM
: . . : : . ::..: .: ::. :. .:... .:.:
CCDS11 APPQPQ-AAPPQQPAAPPQQPQAHTLTTLSSEP--GQSQRTHIKTEQLSPSHYSEQQQHS
360 370 380 390 400 410
350 360 370 380
pF1KB8 ALSGHVPVSQVTPTGPTETSLISVLADATATYYNSYSVS
CCDS11 PQQIAYSPFNLPHYSPSYPPITRSQYDYTDHQNSSSYYSHAAGQGTGLYSTFTYMNPAQR
420 430 440 450 460 470
388 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Fri Nov 4 16:52:00 2016 done: Fri Nov 4 16:52:00 2016
Total Scan time: 3.400 Total Display time: 0.000
Function used was FASTA [36.3.4 Apr, 2011]