FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB9430, 433 aa
  1>>>pF1KB9430 433 - 433 aa - 433 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 9.3697+/-0.00107; mu= 2.8473+/- 0.066
 mean_var=359.1846+/-74.642, 0's: 0 Z-trim(115.0): 56 B-trim: 0 in 0/52
 Lambda= 0.067673
 statistics sampled from 15491 (15539) to 15491 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.766), E-opt: 0.2 (0.477), width: 16
 Scan time: 3.950

The best scores are:                                      opt  bits E(32554)
CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6            ( 474) 2096 218.3 1.3e-56
CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2           ( 441)  781  89.8 5.6e-18
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20         ( 315)  647  76.6 3.9e-14

>>CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6 (474 aa)
 initn: 2197 init1: 2058 opt: 2096 Z-score: 1129.7 bits: 218.3 E(32554): 1.3e-56
Smith-Waterman score: 2699; 91.2% identity (91.2% similar) in 465 aa overlap (1-424:1-465)

               10        20        30        40        50        60
pF1KB9 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB9 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB9 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB9 GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAE
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KB9 QAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 QAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFG
              250       260       270       280       290       300

                                                       310
pF1KB9 GLGTSSSP-----------------------------------------AAGRSPADHRG
       ::::::::                                         :::::::::::
CCDS45 GLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPADHRG
              310       320       330       340       350       360

     320       330       340       350       360       370
pF1KB9 YASLRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS45 YASLRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDDLLDLNPSSNFESMSLG
              370       380       390       400       410       420

     380       390       400       410       420       430
pF1KB9 SFSSSSALDRDLDFNFEPGSGSHFEFPDYCTPEVSEMISGDWLESSISNLVFTY
       :::::::::::::::::::::::::::::::::::::::::::::
CCDS45 SFSSSSALDRDLDFNFEPGSGSHFEFPDYCTPEVSEMISGDWLESSISNLVFTY
              430       440       450       460       470

>>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa)
 initn: 1098 init1: 628 opt: 781 Z-score: 436.3 bits: 89.8 E(32554): 5.6e-18
Smith-Waterman score: 1018; 44.3% identity (66.6% similar) in 461 aa overlap (1-433:1-441)

       10 20 30 40 50 60
pF1KB9 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
       ::::... : .:. : :. :. : :. .: ::. . . ::.:::: :::::
CCDS16 MVQQAESLE-AESNLPREALDTEEG-EF-MACSPVALDES-------DPDWCKTASGHIK
       10 20 30 40 50

       70 80 90 100 110 120
pF1KB9 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
       ::::::::::.:::::::::::::::::::::::::::.::::.::::::::::::::::
CCDS16 RPMNAFMVWSKIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHM
       60 70 80 90 100 110

       130 140 150 160 170 180
pF1KB9 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
       ::::::::::::: : . :.. .::..: ::. ..:.:::. :::.::.... :..
CCDS16 ADYPDYKYRPRKKPK---MDPSAKPSASQSP-EKS--AAGGGGGSAGGGAGGAKTSKGSS
       120 130 140 150 160

       190 200 210 220 230
pF1KB9 ---GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASF
       : . ..:..: . :. : :::: .. .::::.::.. . .
CCDS16 KKCGKLKAPAAAGAKAGAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCVFLDE
       170 180 190 200 210 220

       240 250 260 270 280
pF1KB9 AAEQAGAAALLPLGAAAD---------HHSLYK--ARTPSASASASSAASASAALAAPGK
       .. : : . :..: . .. :: ..:.. :. :
CCDS16 DDDDDDDDDELQLQIKQEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPAS---PTL
       230 240 250 260 270 280

       290 300 310 320 330 340
pF1KB9 HLAEKKVKRVYLFGGLGTSSSPAAGRSPADHRGYASLRAASPAPSSAP--SHASSSASSH
       . .. . . :. . .... .:: . . .. .. : : . : : ::: . :
CCDS16 SSSAESPEGASLYDEVRAGATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVST
       290 300 310 320 330 340

       350 360 370 380 390
pF1KB9 SSSSSSSGSSSSDDEFEDDL---LDLNPSSNFESMS---LGSFSSSSAL-----DRDLDF
       ::::::..::.:. : ::: :.:: :.. .: : ::. .... : :.:::
CCDS16 SSSSSSGSSSGSSGEDADDLMFDLSLNFSQSAHSASEQQLGGGAAAGNLSLSLVDKDLD-
       350 360 370 380 390 400

       400 410 420 430
pF1KB9 NFEPGS-GSHFEFPDYCTPEVSEMISGDWLESSISNLVFTY
       .: :: :::::::::::::.::::.:::::...:.:::::
CCDS16 SFSEGSLGSHFEFPDYCTPELSEMIAGDWLEANFSDLVFTY
       410 420 430 440

>>CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 (315 aa)
 initn: 862 init1: 580 opt: 647 Z-score: 367.2 bits: 76.6 E(32554): 3.9e-14
Smith-Waterman score: 740; 40.6% identity (55.1% similar) in 401 aa overlap (34-433:16-315)

       10 20 30 40 50 60
pF1KB9 QTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIKRPM
       : :: . : : .:.:::::::::::::
CCDS12                MVQQRGARAKRDGGPPPPGPGPAEEG-AREPGWCKTPSGHIKRPM
       10 20 30 40

       70 80 90 100 110 120
pF1KB9 NAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHMADY
       :::::::: ::::::.: :::::::::::::.::.::.::.::::.::::::::::::::
CCDS12 NAFMVWSQHERRKIMDQWPDMHNAEISKRLGRRWQLLQDSEKIPFVREAERLRLKHMADY
       50 60 70 80 90 100

       130 140 150 160 170 180
pF1KB9 PDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGGGGG
       :::::::::: :..: :...: :
CCDS12 PDYKYRPRKK--------SKGAPAKARPRPPG----------------------------
       110 120

       190 200 210 220 230 240
pF1KB9 ASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASFAAEQAG
       .::::. :: : .. : :: . ::: :: ::: .
CCDS12 GSGGGSRLKP------GPQLPG-RGGRRA--------AGGPLGGGAAAPEDDD-------
       130 140 150 160

       250 260 270 280 290 300
pF1KB9 AAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVYLFGGLG
       : . : ..: . .::..: . .. .: .
CCDS12 ---------EDDDEELLEVRL----------------VETPGRELWR------MVPAGRA
       170 180 190

       310 320 330 340 350 360
pF1KB9 TSSSPAAGRSPADHRGYASLRAASPAPSSAPSHASSSASSHSSSSSSSGSSSSDDEFEDD
       . .. ...:. . : :. ::::.:: . .. . . .: .:
CCDS12 ARGQAERAQGPSGE-GAAAAAAASPTPSEDEEPEEEEEEAAAAEEGEEETVASGEESLGF
       200 210 220 230 240 250

       370 380 390 400 410 420
pF1KB9 LLDLNPSSNFESMSLGSFSSSSALDRDLDFNFEPGSG-SHFEFPDYCTPEVSEMISGDWL
       : : :. .: . :::::: :. .: :: :::::::::::::.:::.:::
CCDS12 LSRLPPGPA----GL----DCSALDRDPDL--QPPSGTSHFEFPDYCTPEVTEMIAGDWR
       260 270 280 290 300

            430
pF1KB9 ESSISNLVFTY
        :::..:::::
CCDS12 PSSIADLVFTY
          310

433 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Thu Nov 3 23:20:50 2016 done: Thu Nov 3 23:20:51 2016
 Total Scan time: 3.950 Total Display time: 0.000

Function used was FASTA [36.3.4 Apr, 2011]
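
The per-hit statistics in the report above (initn, init1, opt, Z-score, bits, E-value) can be pulled out with a small script. What follows is only a minimal sketch, not part of the FASTA package: `HIT_RE` and `parse_hits` are hypothetical names introduced for illustration, and the pattern assumes each `>>` header is followed by an `initn: ... E(N): ...` line laid out as in this report. Other FASTA versions or output modes may differ.

```python
import re

# Hypothetical helper, not part of FASTA itself. It matches the ">>" hit
# header plus the statistics line that follows it. The lookarounds skip the
# ">>>" marker that introduces the query line.
HIT_RE = re.compile(
    r"(?<!>)>>(?!>)(?P<acc>\S+)\s+(?P<desc>.*?)\s+\((?P<length>\d+) aa\)\s+"
    r"initn:\s*(?P<initn>\d+)\s+init1:\s*(?P<init1>\d+)\s+opt:\s*(?P<opt>\d+)\s+"
    r"Z-score:\s*(?P<z>[\d.]+)\s+bits:\s*(?P<bits>[\d.]+)\s+"
    r"E\(\d+\):\s*(?P<evalue>\S+)"
)


def parse_hits(report):
    """Return one dict per hit header found in a FASTA text report."""
    hits = []
    for m in HIT_RE.finditer(report):
        hits.append({
            "accession": m.group("acc"),
            "description": m.group("desc"),
            "length": int(m.group("length")),
            "initn": int(m.group("initn")),
            "init1": int(m.group("init1")),
            "opt": int(m.group("opt")),
            "z_score": float(m.group("z")),
            "bits": float(m.group("bits")),
            "evalue": float(m.group("evalue")),
        })
    return hits


# Two hit headers copied from the report above, used as sample input.
sample = """1>>>pF1KB9430 433 - 433 aa - 433 aa
>>CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6 (474 aa)
 initn: 2197 init1: 2058 opt: 2096 Z-score: 1129.7 bits: 218.3 E(32554): 1.3e-56
>>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa)
 initn: 1098 init1: 628 opt: 781 Z-score: 436.3 bits: 89.8 E(32554): 5.6e-18
"""

for hit in parse_hits(sample):
    print(hit["accession"], hit["opt"], hit["bits"], hit["evalue"])
```

Because the pattern relies only on `\s+` between fields, it should work on both the line-broken report and a whitespace-flattened copy of it. For anything beyond a quick look, a machine-readable FASTA output mode is likely a safer input than this free-text layout.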