FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB9724, 418 aa
  1>>>pF1KB9724 418 - 418 aa - 418 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 9.3927+/-0.00129; mu= 3.2094+/- 0.078
 mean_var=460.1851+/-96.375, 0's: 0 Z-trim(114.0): 78 B-trim: 151 in 1/51
 Lambda= 0.059787
 statistics sampled from 14514 (14575) to 14514 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.746), E-opt: 0.2 (0.448), width: 16
 Scan time: 2.980

The best scores are:                                       opt  bits E(32554)
CCDS31996.1 POU4F1 gene_id:5457|Hs108|chr13        ( 419) 2817 257.2 2.1e-68
CCDS34074.1 POU4F2 gene_id:5458|Hs108|chr4         ( 409) 1028 102.9 5.9e-22
CCDS4281.1 POU4F3 gene_id:5459|Hs108|chr5          ( 338)  996 100.0 3.6e-21
CCDS30679.1 POU3F1 gene_id:5453|Hs108|chr1         ( 451)  631  68.7 1.3e-11

>>CCDS31996.1 POU4F1 gene_id:5457|Hs108|chr13 (419 aa)
 initn: 2170 init1: 2170 opt: 2817 Z-score: 1341.5 bits: 257.2 E(32554): 2.1e-68
Smith-Waterman score: 2817; 99.8% identity (99.8% similar) in 419 aa overlap (1-418:1-419)

               10        20        30        40        50        60
pF1KB9 MMSMNSKQPHFAMHPTLPEHKYPSLHSSSEAIRRACLPTPPLQSNLFASLDETLLARAEA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS31 MMSMNSKQPHFAMHPTLPEHKYPSLHSSSEAIRRACLPTPPLQSNLFASLDETLLARAEA
               10        20        30        40        50        60

               70        80        90       100        110
pF1KB9 LAAVDIAVSQGKSHPFKPDATYHTMNSVPCTSTSTVPLAHHHHHHHH-QALEPGDLLDHI
       ::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::
CCDS31 LAAVDIAVSQGKSHPFKPDATYHTMNSVPCTSTSTVPLAHHHHHHHHHQALEPGDLLDHI
               70        80        90       100       110       120

     120       130       140       150       160       170
pF1KB9 SSPSLALMAGAGGAGAAAGGGGAHDGPGGGGGPGGGGGPGGGPGGGGGGGPGGGGGGPGG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS31 SSPSLALMAGAGGAGAAAGGGGAHDGPGGGGGPGGGGGPGGGPGGGGGGGPGGGGGGPGG
              130       140       150       160       170       180

     180       190       200       210       220       230
pF1KB9 GLLGGSAHPHPHMHSLGHLSHPAAAAAMNMPSGLPHPGLVAAAAHHGAAAAAAAAAAGQV
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS31 GLLGGSAHPHPHMHSLGHLSHPAAAAAMNMPSGLPHPGLVAAAAHHGAAAAAAAAAAGQV 190 200 210 220 230 240 240 250 260 270 280 290 pF1KB9 AAASAAAAVVGAAGLASICDSDTDPRELEAFAERFKQRRIKLGVTQADVGSALANLKIPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS31 AAASAAAAVVGAAGLASICDSDTDPRELEAFAERFKQRRIKLGVTQADVGSALANLKIPG 250 260 270 280 290 300 300 310 320 330 340 350 pF1KB9 VGSLSQSTICRFESLTLSHNNMIALKPILQAWLEEAEGAQREKMNKPELFNGGEKKRKRT :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS31 VGSLSQSTICRFESLTLSHNNMIALKPILQAWLEEAEGAQREKMNKPELFNGGEKKRKRT 310 320 330 340 350 360 360 370 380 390 400 410 pF1KB9 SIAAPEKRSLEAYFAVQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKRMKFSATY ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS31 SIAAPEKRSLEAYFAVQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKRMKFSATY 370 380 390 400 410 >>CCDS34074.1 POU4F2 gene_id:5458|Hs108|chr4 (409 aa) initn: 1166 init1: 964 opt: 1028 Z-score: 507.6 bits: 102.9 E(32554): 5.9e-22 Smith-Waterman score: 1292; 58.7% identity (67.6% similar) in 414 aa overlap (29-416:84-407) 10 20 30 40 50 pF1KB9 MMSMNSKQPHFAMHPTLPEHKYPSLHSSSEAIRRACLPTPPLQSNLFASLDETLLARA :::.::::::::: ::.:..:::.::::: CCDS34 GGGGGGGGGGGGGGGRSSSSSSSGSSGGGGSEAMRRACLPTPP--SNIFGGLDESLLARA 60 70 80 90 100 110 60 70 80 90 100 pF1KB9 EALAAVDIAVSQGKSH--------PFKPDATYHTMNSVPCTS---TSTVPLAH------- :::::::: :::.::: ::::::::::::..:::: .:.::..: CCDS34 EALAAVDI-VSQSKSHHHHPPHHSPFKPDATYHTMNTIPCTSAASSSSVPISHPSALAGT 120 130 140 150 160 170 110 120 130 140 150 pF1KB9 HHHHHHH--------QALEPGDLLDHISSPSLALMAGAGGAGAAAGGGGAHDGPGGGGGP ::::::: :::: :.::.:.: :.::: :: :: CCDS34 HHHHHHHHHHHHQPHQALE-GELLEHLS-PGLAL-------GAMAG-------------- 180 190 200 160 170 180 190 200 210 pF1KB9 GGGGGPGGGPGGGGGGGPGGGGGGPGGGLLGGSAHPHPHMHSLGHLSHPAAAAAMNMPSG : :.... :: ::: ... . . 
:: .: : CCDS34 ------------------------PDGAVVSTPAHA-PHMATMNPMHQ--AALSMAHAHG 210 220 230 240 220 230 240 250 260 270 pF1KB9 LPHPGLVAAAAHHGAAAAAAAAAAGQVAAASAAAAVVGAAGLASICDSDTDPRELEAFAE :: .: : . : :.:::.:::::: CCDS34 LP--------SHMGC-----------------------------MSDVDADPRDLEAFAE 250 260 280 290 300 310 320 330 pF1KB9 RFKQRRIKLGVTQADVGSALANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS34 RFKQRRIKLGVTQADVGSALANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWL 270 280 290 300 310 320 340 350 360 370 380 390 pF1KB9 EEAEGAQREKMNKPELFNGGEKKRKRTSIAAPEKRSLEAYFAVQPRPSSEKIAAIAEKLD :::: ..:::..:::::::.::::::::::::::::::::::.::::::::::::::::: CCDS34 EEAEKSHREKLTKPELFNGAEKKRKRTSIAAPEKRSLEAYFAIQPRPSSEKIAAIAEKLD 330 340 350 360 370 380 400 410 pF1KB9 LKKNVVRVWFCNQRQKQKRMKFSATY :::::::::::::::::::::.:: CCDS34 LKKNVVRVWFCNQRQKQKRMKYSAGI 390 400 >>CCDS4281.1 POU4F3 gene_id:5459|Hs108|chr5 (338 aa) initn: 1298 init1: 945 opt: 996 Z-score: 493.6 bits: 100.0 E(32554): 3.6e-21 Smith-Waterman score: 1447; 60.5% identity (71.8% similar) in 425 aa overlap (1-418:1-338) 10 20 30 40 50 60 pF1KB9 MMSMNSKQPHFAMHPTLPEHKYPSLHSSSEAIRRACLPTPPLQSNLFASLDETLLARAEA ::.:::::: :.:::.: : :. ::::.:::.::.:::.: ::.:.:.:.::.::::::: CCDS42 MMAMNSKQP-FGMHPVLQEPKFSSLHSGSEAMRRVCLPAPQLQGNIFGSFDESLLARAEA 10 20 30 40 50 70 80 90 100 110 pF1KB9 LAAVDIAVSQGKSHPFKPDATYHTMNSVPCTSTS-TVPLAH------HHHHHHHQALEPG :::::: ::.::.::::::::::::.:::::::: :::..: : :: ::.:: : CCDS42 LAAVDI-VSHGKNHPFKPDATYHTMSSVPCTSTSSTVPISHPAALTSHPHHAVHQGLE-G 60 70 80 90 100 110 120 130 140 150 160 170 pF1KB9 DLLDHISSPSLALMAGAGGAGAAAGGGGAHDGPGGGGGPGGGGGPGGGPGGGGGGGPGGG :::.::: :.:.. .: CCDS42 DLLEHIS-PTLSV---------------------------------------------SG 120 130 180 190 200 210 220 230 pF1KB9 GGGPGGGLLGGSAHPHPHMHSLGHLSHPAAAAAMNMPSGLPHPGLVAAAAHHGAAAAAAA :.: ... .. ::: :. ..::: : : :. 
:: :: :.: : CCDS42 LGAPEHSVMPAQIHPH-HLGAMGHL-HQAM--------GMSHPHTVAP---HSAMPAC-- 140 150 160 170 240 250 260 270 280 290 pF1KB9 AAAGQVAAASAAAAVVGAAGLASICDSDTDPRELEAFAERFKQRRIKLGVTQADVGSALA . : ..:::::::::::::::::::::::::::.::: CCDS42 -----------------------LSDVESDPRELEAFAERFKQRRIKLGVTQADVGAALA 180 190 200 210 300 310 320 330 340 350 pF1KB9 NLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWLEEAEGAQREKMNKPELFNGGE :::::::::::::::::::::::::::::::::.:::::::::.: ::: .:::::::.: CCDS42 NLKIPGVGSLSQSTICRFESLTLSHNNMIALKPVLQAWLEEAEAAYREKNSKPELFNGSE 220 230 240 250 260 270 360 370 380 390 400 410 pF1KB9 KKRKRTSIAAPEKRSLEAYFAVQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKRMK .::::::::::::::::::::.:::::::::::::::::::::::::::::::::::::: CCDS42 RKRKRTSIAAPEKRSLEAYFAIQPRPSSEKIAAIAEKLDLKKNVVRVWFCNQRQKQKRMK 280 290 300 310 320 330 pF1KB9 FSATY .::.. CCDS42 YSAVH >>CCDS30679.1 POU3F1 gene_id:5453|Hs108|chr1 (451 aa) initn: 598 init1: 369 opt: 631 Z-score: 322.2 bits: 68.7 E(32554): 1.3e-11 Smith-Waterman score: 673; 44.2% identity (68.5% similar) in 292 aa overlap (129-416:120-401) 100 110 120 130 140 150 pF1KB9 AHHHHHHHHQALEPGDLLDHISSPSLALMAGAGGAGAA-AGGGGAHD-GPGGGGGPGGGG ::. :::: : :. :: ::. . .::..: CCDS30 LEHGKAGGGGTGRADDGGGGGGFHARLVHQGAAHAGAAWAQGSTAHHLGPAMSPSPGASG 90 100 110 120 130 140 160 170 180 190 200 210 pF1KB9 GPGGGPGG--GGGGGPGGGGGGPGGGLLGGSAHPHPHMHSLGHLSHPAAAAAMNMPSGLP : : : . .. ::::::: .: : .:.. : .: : : . :. :: : CCDS30 GHQPQPLGLYAQAAYPGGGGGGLAGMLAAGGGGAGPGLH---HALHEDGHEAQLEPSPPP 150 160 170 180 190 200 220 230 240 250 260 270 pF1KB9 HPGLVAAAAHHGAAAAAAAAAAGQVAAASAAAAVVGAAGLASICDSDTDPRELEAFAERF : : . : :. :.. :::: .:..... :: . . .:: .:: ::..: CCDS30 HLGAHGHAHGHAHAGGLHAAAAHLHPGAGGGGSSVGEHSDEDAPSSD----DLEQFAKQF 210 220 230 240 250 260 280 290 300 310 320 330 pF1KB9 KQRRIKLGVTQADVGSALANLKIPGVGSLSQSTICRFESLTLSHNNMIALKPILQAWLEE :::::::: :::::: ::..: : . .::.::::::.: :: .:: :::.:. 
:::: CCDS30 KQRRIKLGFTQADVGLALGTLY--G-NVFSQTTICRFEALQLSFKNMCKLKPLLNKWLEE 270 280 290 300 310 340 350 360 370 380 390 pF1KB9 AEGAQREKMNKPELFNGGEKKRKRTSIAAPEKRSLEAYFAVQPRPSSEKIAAIAEKLDLK ..... : .. :.:..::::: . : .::..: :.::...:...:..:.:. CCDS30 TDSSSGSPTNLDKIAAQGRKRKKRTSIEVGVKGALESHFLKCPKPSAHEITGLADSLQLE 320 330 340 350 360 370 400 410 pF1KB9 KNVVRVWFCNQRQKQKRMKFSATY :.::::::::.:::.::: .: CCDS30 KEVVRVWFCNRRQKEKRMTPAAGAGHPPMDDVYAPGELGPGGGGASPPSAPPPPPPAALH 380 390 400 410 420 430 418 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Tue Nov 8 07:45:34 2016 done: Tue Nov 8 07:45:34 2016 Total Scan time: 2.980 Total Display time: 0.000 Function used was FASTA [36.3.4 Apr, 2011]