FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB9722, 410 aa
  1>>>pF1KB9722 410 - 410 aa - 410 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 8.9914+/-0.00109; mu= 3.9713+/- 0.066
 mean_var=362.9935+/-75.849, 0's: 0 Z-trim(114.2): 46 B-trim: 0 in 0/51
 Lambda= 0.067317
 statistics sampled from 14754 (14791) to 14754 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.76), E-opt: 0.2 (0.454), width: 16
 Scan time: 2.370

The best scores are:                                       opt  bits E(32554)
CCDS34074.1 POU4F2 gene_id:5458|Hs108|chr4         ( 409) 2750 280.8 1.7e-75
CCDS4281.1 POU4F3 gene_id:5459|Hs108|chr5          ( 338) 1208 130.9 1.8e-30
CCDS31996.1 POU4F1 gene_id:5457|Hs108|chr13        ( 419) 1028 113.5 3.7e-25
CCDS5040.1 POU3F2 gene_id:5454|Hs108|chr6          ( 443)  657  77.5 2.7e-14
CCDS58190.1 POU2F3 gene_id:25833|Hs108|chr11       ( 438)  544  66.6 5.4e-11
CCDS8431.1 POU2F3 gene_id:25833|Hs108|chr11        ( 436)  542  66.4 6.1e-11

>>CCDS34074.1 POU4F2 gene_id:5458|Hs108|chr4 (409 aa)
 initn: 2438 init1: 2438 opt: 2750 Z-score: 1469.1 bits: 280.8 E(32554): 1.7e-75
Smith-Waterman score: 2750; 99.8% identity (99.8% similar) in 410 aa overlap (1-410:1-409)

       10 20 30 40 50 60
pF1KB9 MMMMSLNSKQAFSMPHGGSLHVEPKYSALHSTSPGSSAPIAPSASSPSSSSNAGGGGGGG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS34 MMMMSLNSKQAFSMPHGGSLHVEPKYSALHSTSPGSSAPIAPSASSPSSSSNAGGGGGGG
       10 20 30 40 50 60

       70 80 90 100 110 120
pF1KB9 GGGGGGGGGRSSSSSSSGSSGGGGSEAMRRACLPTPPSNIFGGLDESLLARAEALAAVDI
       :::::::: :::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS34 GGGGGGGG-RSSSSSSSGSSGGGGSEAMRRACLPTPPSNIFGGLDESLLARAEALAAVDI
       70 80 90 100 110

       130 140 150 160 170 180
pF1KB9 VSQSKSHHHHPPHHSPFKPDATYHTMNTIPCTSAASSSSVPISHPSALAGTHHHHHHHHH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS34 VSQSKSHHHHPPHHSPFKPDATYHTMNTIPCTSAASSSSVPISHPSALAGTHHHHHHHHH
       120 130 140 150 160 170

       190 200 210 220 230 240
pF1KB9 HHHQPHQALEGELLEHLSPGLALGAMAGPDGAVVSTPAHAPHMATMNPMHQAALSMAHAH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS34 HHHQPHQALEGELLEHLSPGLALGAMAGPDGAVVSTPAHAPHMATMNPMHQAALSMAHAH
       180 190 200 210 220 230

       250 260 270 280 290 300
pF1KB9 GLPSHMGCMSDVDADPRDLEAFAERFKQRRIKLGVTQADVGSALANLKIPGVGSLSQSTI
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS34 GLPSHMGCMSDVDADPRDLEAFAERFKQRRIKLGVTQADVGSALANLKIPGVGSLSQSTI
       240 250 260 270 280 290

       310 320 330 340 350 360
pF1KB9 CRFESLTLSHNNMIALKPILQAWLEEAEKSHREKLTKPELFNGAEKKRKRTSIAAPEKRS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS34 CRFESLTLSHNNMIALKPILQAWLEEAEKSHREKLTKPELFNGAEKKRKRTSIAAPEKRS
       300 310 320 330 340 350

       370 380 390 400 410
pF1KB9 LEAYFAIQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKRMKYSAGI
       ::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS34 LEAYFAIQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKRMKYSAGI
       360 370 380 390 400

>>CCDS4281.1 POU4F3 gene_id:5459|Hs108|chr5 (338 aa)
 initn: 1377 init1: 979 opt: 1208 Z-score: 660.6 bits: 130.9 E(32554): 1.8e-30
Smith-Waterman score: 1506; 70.7% identity (86.3% similar) in 335 aa overlap (79-408:22-336)

       50 60 70 80 90 100
pF1KB9 SSSNAGGGGGGGGGGGGGGGGRSSSSSSSGSSGGGGSEAMRRACLPTPP--SNIFGGLDE
       :: .:::::::.:::.: .::::..::
CCDS42 MMAMNSKQPFGMHPVLQEPKFSSLHSGSEAMRRVCLPAPQLQGNIFGSFDE
       10 20 30 40 50

       110 120 130 140 150 160
pF1KB9 SLLARAEALAAVDIVSQSKSHHHHPPHHSPFKPDATYHTMNTIPCTSAASSSSVPISHPS
       ::::::::::::::::..:.: :::::::::::...:::: .::.::::::.
CCDS42 SLLARAEALAAVDIVSHGKNH--------PFKPDATYHTMSSVPCTS--TSSTVPISHPA
       60 70 80 90 100

       170 180 190 200 210 220
pF1KB9 ALAGTHHHHHHHHHHHHQPHQALEGELLEHLSPGLALGAMAGPDGAVVSTPAHAPHMATM
       ::.. : :: ::.:::.::::.:: :.......:. .:. . : :...:
CCDS42 ALTS---------HPHHAVHQGLEGDLLEHISPTLSVSGLGAPEHSVMPAQIHPHHLGAM
       110 120 130 140 150

       230 240 250 260 270 280
pF1KB9 NPMHQAALSMAHAHGLPSHMG---CMSDVDADPRDLEAFAERFKQRRIKLGVTQADVGSA
       . .::: ..:.: : . : . :.:::..:::.:::::::::::::::::::::::.:
CCDS42 GHLHQA-MGMSHPHTVAPHSAMPACLSDVESDPRELEAFAERFKQRRIKLGVTQADVGAA
       160 170 180 190 200 210

       290 300 310 320 330 340
pF1KB9 LANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWLEEAEKSHREKLTKPELFNG
       :::::::::::::::::::::::::::::::::::.::::::::: ..::: .:::::::
CCDS42 LANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPVLQAWLEEAEAAYREKNSKPELFNG
       220 230 240 250 260 270

       350 360 370 380 390 400
pF1KB9 AEKKRKRTSIAAPEKRSLEAYFAIQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKR
       .:.:::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS42 SERKRKRTSIAAPEKRSLEAYFAIQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKR
       280 290 300 310 320 330

       410
pF1KB9 MKYSAGI
       :::::
CCDS42 MKYSAVH

>>CCDS31996.1 POU4F1 gene_id:5457|Hs108|chr13 (419 aa)
 initn: 1176 init1: 964 opt: 1028 Z-score: 565.1 bits: 113.5 E(32554): 3.7e-25
Smith-Waterman score: 1304; 58.9% identity (67.9% similar) in 414 aa overlap (85-408:29-417)

       60 70 80 90 100 110
pF1KB9 GGGGGGGGGGGGGGGRSSSSSSSGSSGGGGSEAMRRACLPTPP--SNIFGGLDESLLARA
       :::.::::::::: ::.:..:::.:::::
CCDS31 MMSMNSKQPHFAMHPTLPEHKYPSLHSSSEAIRRACLPTPPLQSNLFASLDETLLARA
       10 20 30 40 50

       120 130 140 150 160 170
pF1KB9 EALAAVDI-VSQSKSHHHHPPHHSPFKPDATYHTMNTIPCTSAASSSSVPISHPSALAGT
       :::::::: :::.::: ::::::::::::..:::: .:.::..:
CCDS31 EALAAVDIAVSQGKSH--------PFKPDATYHTMNSVPCTS---TSTVPLAH-------
       60 70 80 90 100

       180 190 200
pF1KB9 HHHHHHHHHHHHQPHQALE-GELLEHLS-PGLAL-------GAMAG--------------
       ::::::: ::::: :.::.:.: :.::: :: ::
CCDS31 -----HHHHHHH--HQALEPGDLLDHISSPSLALMAGAGGAGAAAGGGGAHDGPGGGGGP
       110 120 130 140 150

       210 220 230 240
pF1KB9 ------------------------PDGAVVSTPAHA-PHMATMNPMHQ--AALSMAHAHG
       : :.... :: ::: ... . . :: .: :
CCDS31 GGGGGPGGGPGGGGGGGPGGGGGGPGGGLLGGSAHPHPHMHSLGHLSHPAAAAAMNMPSG
       160 170 180 190 200 210

       250 260
pF1KB9 LP--------SHMGC-----------------------------MSDVDADPRDLEAFAE
       :: .: : . : :.:::.::::::
CCDS31 LPHPGLVAAAAHHGAAAAAAAAAAGQVAAASAAAAVVGAAGLASICDSDTDPRELEAFAE
       220 230 240 250 260 270

       270 280 290 300 310 320
pF1KB9 RFKQRRIKLGVTQADVGSALANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS31 RFKQRRIKLGVTQADVGSALANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWL
       280 290 300 310 320 330

       330 340 350 360 370 380
pF1KB9 EEAEKSHREKLTKPELFNGAEKKRKRTSIAAPEKRSLEAYFAIQPRPSSEKIAAIAEKLD
       :::: ..:::..:::::::.::::::::::::::::::::::.:::::::::::::::::
CCDS31 EEAEGAQREKMNKPELFNGGEKKRKRTSIAAPEKRSLEAYFAVQPRPSSEKIAAIAEKLD
       340 350 360 370 380 390

       390 400 410
pF1KB9 LKKNVVRVWFCNQRQKQKRMKYSAGI
       :::::::::::::::::::::.::
CCDS31 LKKNVVRVWFCNQRQKQKRMKFSATY
       400 410

>>CCDS5040.1 POU3F2 gene_id:5454|Hs108|chr6 (443 aa)
 initn: 621 init1: 368 opt: 657 Z-score: 370.1 bits: 77.5 E(32554): 2.7e-14
Smith-Waterman score: 663; 35.2% identity (59.6% similar) in 406 aa overlap (22-409:41-417)

       10 20 30 40 50
pF1KB9 MMMMSLNSKQAFSMPHGGSLHVEPKYSALHSTSPGSSAPIAPSASSPSSSS
       :. :.::.: .. :.. . . .. :
CCDS50 LLTSSASIVHAEPPGGMQQGAGGYREAQSLVQGDYGALQS----NGHPLSHAHQWITALS
       20 30 40 50 60

       60 70 80 90 100
pF1KB9 NAGGGGGGGGGGGGGGGGRSSSSSSSGSSGGGGSEAMRRACLPTPPSNIF--GGLDESL-
       ..:::::::::::::::: .....: :.. :. .. :: . :: . :
CCDS50 HGGGGGGGGGGGGGGGGGGGGGDGSPWSTSPLGQPDIK-------PSVVVQQGGRGDELH
       70 80 90 100 110

       110 120 130 140 150 160
pF1KB9 ----LARAEALAAVDIVSQSKSHHHHPPHHSPFKPDATYHTMNTIPCTSAASSSSVPISH
       : . . . .:....... .. : : ..:. : : .: :...
CCDS50 GPGALQQQHQQQQQQQQQQQQQQQQQQQQQRP--PHLVHHAANHHPGPGAWRSAAAAAHL
       120 130 140 150 160 170

       170 180 190 200 210 220
pF1KB9 PSALAGTHHHHHHHHHHHHQPHQALEGELLEHLSPGLALGAMAGPDGAVVSTPAHAPHMA
       : ...... . :: ...: : : : : :: . :: :
CCDS50 PPSMGASNGGLLYS-----QPSFTVNGML------G-AGGQPAGLHHHGLRDAHDEPHHA
       180 190 200 210 220

       230 240 250 260 270
pF1KB9 TMNPM-----HQAALSMAHAHGLPSHMGCMSDVDAD---PR--DLEAFAERFKQRRIKLG
       .: :: .: :.: : : .: : ::: ::..:::::::::
CCDS50 DHHPHPHSHPHQQPPPPPPPQGPPGHPGAHHDPHSDEDTPTSDDLEQFAKQFKQRRIKLG
       230 240 250 260 270 280

       280 290 300 310 320 330
pF1KB9 VTQADVGSALANLKIPGVGSL-SQSTICRFESLTLSHNNMIALKPILQAWLEEAEKSHRE
       :::::: ::..: :.. ::.::::::.: :: .:: :::.:. :::::..:
CCDS50 FTQADVGLALGTL----YGNVFSQTTICRFEALQLSFKNMCKLKPLLNKWLEEADSSSGS
       290 300 310 320 330 340

       340 350 360 370 380 390
pF1KB9 KLTKPELFNGAEKKRKRTSIAAPEKRSLEAYFAIQPRPSSEKIAAIAEKLDLKKNVVRVW
       . .. ..:..::::: . : .::..: :.::...:...:..:.:.:.:::::
CCDS50 PTSIDKIAAQGRKRKKRTSIEVSVKGALESHFLKCPKPSAQEITSLADSLQLEKEVVRVW
       350 360 370 380 390 400

       400 410
pF1KB9 FCNQRQKQKRMKYSAGI
       :::.:::.::: .:
CCDS50 FCNRRQKEKRMTPPGGTLPGAEDVYGGSRDTPPHHGVQTPVQ
       410 420 430 440

>>CCDS58190.1 POU2F3 gene_id:25833|Hs108|chr11 (438 aa)
 initn: 493 init1: 216 opt: 544 Z-score: 310.9 bits: 66.6 E(32554): 5.4e-11
Smith-Waterman score: 544; 35.9% identity (64.1% similar) in 351 aa overlap (68-404:9-341)

       40 50 60 70 80 90
pF1KB9 APIAPSASSPSSSSNAGGGGGGGGGGGGGGGGRSSSSSSSGSSGGGGSEAMRRACLPTPP
       :::. . :.. .. : : . . :
CCDS58 MESPRTAKGGRDIKMSGDVAD----STDARSTLSQVEP
       10 20 30

       100 110 120 130 140 150
pF1KB9 SNIFGGLDESLLARAEALAAVDIVSQSKSHHHHPPHHS--PFKPDATYHT-MNTIPCTSA
       .: .::: . ..: :. : ..:. ::. : : : : ... . .:. :: .
CCDS58 GNDRNGLDFNRQIKTEDLS--DSLQQTLSHR--PCHLSQGPAMMSGNQMSGLNASPCQDM
       40 50 60 70 80 90

       160 170 180 190 200 210
pF1KB9 ASSSSVPISHPSALAGTHHHHHHHHHHHHQP-HQALEGELLE--HLSPGLALGAMAGPDG
       :: :... . : . . . :: .:.:. .:: . . :: : ..:: :
CCDS58 ASLH--PLQQLVLVPGHLQSVSQFLLSQTQPGQQGLQPNLLPFPQQQSGLLL-PQTGP-G
       100 110 120 130 140

       220 230 240 250 260
pF1KB9 AVVSTPAHAPHM--ATMNPMHQAALSMAHAHGLPSHMGCMSDVDADPRDLEAFAERFKQR
       . .. .: : . ....: .:. . . ::: : .: .: ..:: ::. ::::
CCDS58 LASQAFGH-PGLPGSSLEPHLEASQHLPVPKHLPSSGG--ADEPSDLEELEKFAKTFKQR
       150 160 170 180 190 200

       270 280 290 300 310 320
pF1KB9 RIKLGVTQADVGSALANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWLEEAEK
       ::::: ::.::: :.. :. : ...::.:: :::.:.:: .:: :::.:. ::..::.
CCDS58 RIKLGFTQGDVGLAMG--KLYG-NDFSQTTISRFEALNLSFKNMCKLKPLLEKWLNDAES
       210 220 230 240 250 260

       330 340 350 360 370 380
pF1KB9 SHRE-KLTKPELFNG-----AEKKRKRTSIAAPEKRSLEAYFAIQPRPSSEKIAAIAEKL
       : . ... : . . ..:..::::: . . .:: : .:.::::.:. :::.:
CCDS58 SPSDPSVSTPSSYPSLSEVFGRKRKKRTSIETNIRLTLEKRFQDNPKPSSEEISMIAEQL
       270 280 290 300 310 320

       390 400 410
pF1KB9 DLKKNVVRVWFCNQRQKQKRMKYSAGI
       ...:.::::::::.:::.::.
CCDS58 SMEKEVVRVWFCNRRQKEKRINCPVATPIKPPVYNSRLVSPSGSLGPLSVPPVHSTMPGT
       330 340 350 360 370 380

>>CCDS8431.1 POU2F3 gene_id:25833|Hs108|chr11 (436 aa)
 initn: 493 init1: 216 opt: 542 Z-score: 309.9 bits: 66.4 E(32554): 6.1e-11
Smith-Waterman score: 542; 37.3% identity (65.5% similar) in 322 aa overlap (97-404:32-339)

       70 80 90 100 110 120
pF1KB9 GGGRSSSSSSSGSSGGGGSEAMRRACLPTPPSNIFGGLDESLLARAEALAAVDIVSQSKS
       :.: .::: . ..: :. : ..:. :
CCDS84 VNLESMHTDIKMSGDVADSTDARSTLSQVEPGNDRNGLDFNRQIKTEDLS--DSLQQTLS
       10 20 30 40 50

       130 140 150 160 170 180
pF1KB9 HHHHPPHHS--PFKPDATYHT-MNTIPCTSAASSSSVPISHPSALAGTHHHHHHHHHHHH
       : .: : : : ... . .:. :: . :: :... . : . . .
CCDS84 H--RPCHLSQGPAMMSGNQMSGLNASPCQDMASLH--PLQQLVLVPGHLQSVSQFLLSQT
       60 70 80 90 100 110

       190 200 210 220 230
pF1KB9 QP-HQALEGELLE--HLSPGLALGAMAGPDGAVVSTPAHAPHM--ATMNPMHQAALSMAH
       :: .:.:. .:: . . :: : ..:: : . .. .: : . ....: .:. .
CCDS84 QPGQQGLQPNLLPFPQQQSGLLL-PQTGP-GLASQAFGH-PGLPGSSLEPHLEASQHLPV
       120 130 140 150 160 170

       240 250 260 270 280 290
pF1KB9 AHGLPSHMGCMSDVDADPRDLEAFAERFKQRRIKLGVTQADVGSALANLKIPGVGSLSQS
       . ::: : .: .: ..:: ::. ::::::::: ::.::: :.. :. : ...::.
CCDS84 PKHLPSSGG--ADEPSDLEELEKFAKTFKQRRIKLGFTQGDVGLAMG--KLYG-NDFSQT
       180 190 200 210 220

       300 310 320 330 340 350
pF1KB9 TICRFESLTLSHNNMIALKPILQAWLEEAEKSHRE-KLTKPELFNG-----AEKKRKRTS
       :: :::.:.:: .:: :::.:. ::..::.: . ... : . . ..:..::::
CCDS84 TISRFEALNLSFKNMCKLKPLLEKWLNDAESSPSDPSVSTPSSYPSLSEVFGRKRKKRTS
       230 240 250 260 270 280

       360 370 380 390 400 410
pF1KB9 IAAPEKRSLEAYFAIQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKRMKYSAGI
       : . . .:: : .:.::::.:. :::.:...:.::::::::.:::.::.
CCDS84 IETNIRLTLEKRFQDNPKPSSEEISMIAEQLSMEKEVVRVWFCNRRQKEKRINCPVATPI
       290 300 310 320 330 340

CCDS84 KPPVYNSRLVSPSGSLGPLSVPPVHSTMPGTVTSSCSPGNNSRPSSPGSGLHASSPTASQ
       350 360 370 380 390 400

410 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Tue Nov 8 04:27:39 2016 done: Tue Nov 8 04:27:39 2016
 Total Scan time: 2.370 Total Display time: 0.010

Function used was FASTA [36.3.4 Apr, 2011]