FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE4494, 515 aa
  1>>>pF1KE4494 515 - 515 aa - 515 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.8165+/-0.00103; mu= 14.5159+/- 0.062
 mean_var=58.6720+/-11.928, 0's: 0 Z-trim(102.4): 24 B-trim: 0 in 0/49
 Lambda= 0.167440
 statistics sampled from 6923 (6928) to 6923 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.583), E-opt: 0.2 (0.213), width: 16
 Scan time: 3.000

The best scores are:                                      opt bits E(32554)
CCDS44023.1 G6PD gene_id:2539|Hs108|chrX           ( 515) 3486 850.9       0
CCDS14756.2 G6PD gene_id:2539|Hs108|chrX           ( 545) 3486 850.9       0
CCDS101.1 H6PD gene_id:9563|Hs108|chr1             ( 791)  669 170.4 6.6e-42
CCDS72697.1 H6PD gene_id:9563|Hs108|chr1           ( 802)  669 170.4 6.7e-42

>>CCDS44023.1 G6PD gene_id:2539|Hs108|chrX (515 aa)
 initn: 3486 init1: 3486 opt: 3486 Z-score: 4546.8 bits: 850.9 E(32554): 0
Smith-Waterman score: 3486; 100.0% identity (100.0% similar) in 515 aa overlap (1-515:1-515)

               10        20        30        40        50        60
pF1KE4 MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLAKKKIYPTIWWLFRDGL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLAKKKIYPTIWWLFRDGL
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE4 LPENTFIVGYARSRLTVADIRKQSEPFFKATPEEKLKLEDFFARNSYVAGQYDDAASYQR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 LPENTFIVGYARSRLTVADIRKQSEPFFKATPEEKLKLEDFFARNSYVAGQYDDAASYQR
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE4 LNSHMNALHLGSQANRLFYLALPPTVYEAVTKNIHESCMSQIGWNRIIVEKPFGRDLQSS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 LNSHMNALHLGSQANRLFYLALPPTVYEAVTKNIHESCMSQIGWNRIIVEKPFGRDLQSS
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE4 DRLSNHISSLFREDQIYRIDHYLGKEMVQNLMVLRFANRIFGPIWNRDNIACVILTFKEP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 DRLSNHISSLFREDQIYRIDHYLGKEMVQNLMVLRFANRIFGPIWNRDNIACVILTFKEP
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KE4 FGTEGRGGYFDEFGIIRDVMQNHLLQMLCLVAMEKPASTNSDDVRDEKVKVLKCISEVQA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 FGTEGRGGYFDEFGIIRDVMQNHLLQMLCLVAMEKPASTNSDDVRDEKVKVLKCISEVQA
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KE4 NNVVLGQYVGNPDGEGEATKGYLDDPTVPRGSTTATFAAVVLYVENERWDGVPFILRCGK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 NNVVLGQYVGNPDGEGEATKGYLDDPTVPRGSTTATFAAVVLYVENERWDGVPFILRCGK
              310       320       330       340       350       360

              370       380       390       400       410       420
pF1KE4 ALNERKAEVRLQFHDVAGDIFHQQCKRNELVIRVQPNEAVYTKMMTKKPGMFFNPEESEL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 ALNERKAEVRLQFHDVAGDIFHQQCKRNELVIRVQPNEAVYTKMMTKKPGMFFNPEESEL
              370       380       390       400       410       420

              430       440       450       460       470       480
pF1KE4 DLTYGNRYKNVKLPDAYERLILDVFCGSQMHFVRSDELREAWRIFTPLLHQIELEKPKPI
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 DLTYGNRYKNVKLPDAYERLILDVFCGSQMHFVRSDELREAWRIFTPLLHQIELEKPKPI
              430       440       450       460       470       480

              490       500       510
pF1KE4 PYIYGSRGPTEADELMKRVGFQYEGTYKWVNPHKL
       :::::::::::::::::::::::::::::::::::
CCDS44 PYIYGSRGPTEADELMKRVGFQYEGTYKWVNPHKL
              490       500       510

>>CCDS14756.2 G6PD gene_id:2539|Hs108|chrX (545 aa)
 initn: 3486 init1: 3486 opt: 3486 Z-score: 4546.3 bits: 850.9 E(32554): 0
Smith-Waterman score: 3486; 100.0% identity (100.0% similar) in 515 aa overlap (1-515:31-545)

                                             10        20        30
pF1KE4                               MAEQVALSRTQVCGILREELFQGDAFHQSD
                                     ::::::::::::::::::::::::::::::
CCDS14 MGRRGSAPGNGRTLRGCERGGRRRRSADSVMAEQVALSRTQVCGILREELFQGDAFHQSD
               10        20        30        40        50        60

               40        50        60        70        80        90
pF1KE4 THIFIIMGASGDLAKKKIYPTIWWLFRDGLLPENTFIVGYARSRLTVADIRKQSEPFFKA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 THIFIIMGASGDLAKKKIYPTIWWLFRDGLLPENTFIVGYARSRLTVADIRKQSEPFFKA
               70        80        90       100       110       120

              100       110       120       130       140       150
pF1KE4 TPEEKLKLEDFFARNSYVAGQYDDAASYQRLNSHMNALHLGSQANRLFYLALPPTVYEAV
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 TPEEKLKLEDFFARNSYVAGQYDDAASYQRLNSHMNALHLGSQANRLFYLALPPTVYEAV 130 140 150 160 170 180 160 170 180 190 200 210 pF1KE4 TKNIHESCMSQIGWNRIIVEKPFGRDLQSSDRLSNHISSLFREDQIYRIDHYLGKEMVQN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 TKNIHESCMSQIGWNRIIVEKPFGRDLQSSDRLSNHISSLFREDQIYRIDHYLGKEMVQN 190 200 210 220 230 240 220 230 240 250 260 270 pF1KE4 LMVLRFANRIFGPIWNRDNIACVILTFKEPFGTEGRGGYFDEFGIIRDVMQNHLLQMLCL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 LMVLRFANRIFGPIWNRDNIACVILTFKEPFGTEGRGGYFDEFGIIRDVMQNHLLQMLCL 250 260 270 280 290 300 280 290 300 310 320 330 pF1KE4 VAMEKPASTNSDDVRDEKVKVLKCISEVQANNVVLGQYVGNPDGEGEATKGYLDDPTVPR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 VAMEKPASTNSDDVRDEKVKVLKCISEVQANNVVLGQYVGNPDGEGEATKGYLDDPTVPR 310 320 330 340 350 360 340 350 360 370 380 390 pF1KE4 GSTTATFAAVVLYVENERWDGVPFILRCGKALNERKAEVRLQFHDVAGDIFHQQCKRNEL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 GSTTATFAAVVLYVENERWDGVPFILRCGKALNERKAEVRLQFHDVAGDIFHQQCKRNEL 370 380 390 400 410 420 400 410 420 430 440 450 pF1KE4 VIRVQPNEAVYTKMMTKKPGMFFNPEESELDLTYGNRYKNVKLPDAYERLILDVFCGSQM :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 VIRVQPNEAVYTKMMTKKPGMFFNPEESELDLTYGNRYKNVKLPDAYERLILDVFCGSQM 430 440 450 460 470 480 460 470 480 490 500 510 pF1KE4 HFVRSDELREAWRIFTPLLHQIELEKPKPIPYIYGSRGPTEADELMKRVGFQYEGTYKWV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 HFVRSDELREAWRIFTPLLHQIELEKPKPIPYIYGSRGPTEADELMKRVGFQYEGTYKWV 490 500 510 520 530 540 pF1KE4 NPHKL ::::: CCDS14 NPHKL >>CCDS101.1 H6PD gene_id:9563|Hs108|chr1 (791 aa) initn: 650 init1: 193 opt: 669 Z-score: 865.8 bits: 170.4 E(32554): 6.6e-42 Smith-Waterman score: 785; 32.6% identity (62.4% similar) in 503 aa overlap (14-481:15-502) 10 20 30 40 50 pF1KE4 MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLAKKKIYPTIWWLFRDG : :. . .:: : : .:..::.:::::: .. .. :. 
: CCDS10 MWNMLIVAMCLALLGCLQAQELQG---HVS----IILLGATGDLAKKYLWQGLFQLYLDE 10 20 30 40 50 60 70 80 90 100 110 pF1KE4 LLPENTFIV-GYARS------RLTVADIRKQSEPFFKATPEEKLKLEDFFARNSYVAGQY ..: : : . .: . ... : : .: . . .: : . : : CCDS10 AGRGHSFSFHGAALTAPKQGQELMAKALESLSCPK-DMAPSHCAEHKDQFLQLSQYR-QL 60 70 80 90 100 110 120 130 140 150 160 pF1KE4 DDAASYQRLNSHMNAL--HLG-SQANRLFYLALPPTVYEAVTKNIHESCMSQIG-WNRII : .:: ::. ..: : : .:.:.::...:: .:: ...::. :: : : :.. CCDS10 KTAEDYQALNKDIEAQLQHAGLREAGRIFYFSVPPFAYEDIARNINSSCRPGPGAWLRVV 120 130 140 150 160 170 170 180 190 200 210 220 pF1KE4 VEKPFGRDLQSSDRLSNHISSLFREDQIYRIDHYLGKEMVQNLMVLRFANR-IFGPIWNR .:::::.: :...:.......:.:...::.::::::. : ... .: :: . .::: CCDS10 LEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQILPFRDQNRKALDGLWNR 180 190 200 210 220 230 230 240 250 260 270 280 pF1KE4 DNIACVILTFKEPFGTEGRGGYFDEFGIIRDVMQNHLLQMLCLVAMEKPASTNSDD-VRD .. : . .:: .::: ....:.:.::::.:::: ..: ::::: : ...: . : CCDS10 HHVERVEIIMKETVDAEGRTSFYEEYGVIRDVLQNHLTEVLTLVAMELPHNVSSAEAVLR 240 250 260 270 280 290 290 300 310 320 330 340 pF1KE4 EKVKVLKCISEVQANNVVLGQYVGNPDGEGEATKGYLDDPTVPRGSTTATFAAVVLYVEN .:..:.. . .: ...:.::: .. .: .. :. : . : : :::::.....: CCDS10 HKLQVFQALRGLQRGSAVVGQY----QSYSEQVRRELQKPDSFH-SLTPTFAAVLVHIDN 300 310 320 330 340 350 360 370 380 390 pF1KE4 ERWDGVPFILRCGKALNERKAEVRLQFHDVAGDI--------FHQQCKRNELVIRVQPNE ::.:::::: ::::.:: . .:. :.. : . ..:: .::... .. CCDS10 LRWEGVPFILMSGKALDERVGYARILFKNQACCVQSEKHWAAAQSQCLPRQLVFHIGHGD 350 360 370 380 390 400 400 410 420 430 440 pF1KE4 ----AVYTKMMTKKPGMFFNPEESE----LDLTYGN------RYKNVKLPDAYERLILDV :: .. .:.. . .: : : : .:. :. :. ::. :. . CCDS10 LGSPAVLVSRNLFRPSLPSSWKEMEGPPGLRL-FGSPLSDYYAYSPVRERDAHSVLLSHI 410 420 430 440 450 460 450 460 470 480 490 500 pF1KE4 FCGSQMHFVRSDELREAWRIFTPLLHQIELEKPKPIPYIYGSRGPTEADELMKRVGFQYE : : . :. ...: .: ..::::... . :. 
: CCDS10 FHGRKNFFITTENLLASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFEFSSGRLFFSQQ 470 480 490 500 510 520 510 pF1KE4 GTYKWVNPHKL CCDS10 QPEQLVPGPGPAPMPSDFQVLRAKYRESPLVSAWSEELISKLANDIEATAVRAVRRFGQF 530 540 550 560 570 580 >>CCDS72697.1 H6PD gene_id:9563|Hs108|chr1 (802 aa) initn: 650 init1: 193 opt: 669 Z-score: 865.7 bits: 170.4 E(32554): 6.7e-42 Smith-Waterman score: 785; 32.6% identity (62.4% similar) in 503 aa overlap (14-481:26-513) 10 20 30 40 pF1KE4 MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLAKKKI : :. . .:: : : .:..::.:::::: . CCDS72 MLAEPFNWHPGMWNMLIVAMCLALLGCLQAQELQG---HVS----IILLGATGDLAKKYL 10 20 30 40 50 50 60 70 80 90 100 pF1KE4 YPTIWWLFRDGLLPENTFIV-GYARS------RLTVADIRKQSEPFFKATPEEKLKLEDF . .. :. : ..: : : . .: . ... : : .: . . .: CCDS72 WQGLFQLYLDEAGRGHSFSFHGAALTAPKQGQELMAKALESLSCPK-DMAPSHCAEHKDQ 60 70 80 90 100 110 110 120 130 140 150 pF1KE4 FARNSYVAGQYDDAASYQRLNSHMNAL--HLG-SQANRLFYLALPPTVYEAVTKNIHESC : . : : : .:: ::. ..: : : .:.:.::...:: .:: ...::. :: CCDS72 FLQLSQYR-QLKTAEDYQALNKDIEAQLQHAGLREAGRIFYFSVPPFAYEDIARNINSSC 120 130 140 150 160 170 160 170 180 190 200 210 pF1KE4 MSQIG-WNRIIVEKPFGRDLQSSDRLSNHISSLFREDQIYRIDHYLGKEMVQNLMVLRFA : : :...:::::.: :...:.......:.:...::.::::::. : ... .: CCDS72 RPGPGAWLRVVLEKPFGHDHFSAQQLATELGTFFQEEEMYRVDHYLGKQAVAQILPFRDQ 180 190 200 210 220 230 220 230 240 250 260 270 pF1KE4 NR-IFGPIWNRDNIACVILTFKEPFGTEGRGGYFDEFGIIRDVMQNHLLQMLCLVAMEKP :: . .::: .. : . .:: .::: ....:.:.::::.:::: ..: ::::: : CCDS72 NRKALDGLWNRHHVERVEIIMKETVDAEGRTSFYEEYGVIRDVLQNHLTEVLTLVAMELP 240 250 260 270 280 290 280 290 300 310 320 330 pF1KE4 ASTNSDD-VRDEKVKVLKCISEVQANNVVLGQYVGNPDGEGEATKGYLDDPTVPRGSTTA ...: . : .:..:.. . .: ...:.::: .. .: .. :. : . : : CCDS72 HNVSSAEAVLRHKLQVFQALRGLQRGSAVVGQY----QSYSEQVRRELQKPDSFH-SLTP 300 310 320 330 340 340 350 360 370 380 pF1KE4 TFAAVVLYVENERWDGVPFILRCGKALNERKAEVRLQFHDVAGDI--------FHQQCKR :::::.....: ::.:::::: ::::.:: . .:. :.. : . 
..:: CCDS72 TFAAVLVHIDNLRWEGVPFILMSGKALDERVGYARILFKNQACCVQSEKHWAAAQSQCLP 350 360 370 380 390 400 390 400 410 420 430 pF1KE4 NELVIRVQPNE----AVYTKMMTKKPGMFFNPEESE----LDLTYGN------RYKNVKL .::... .. :: .. .:.. . .: : : : .:. :. :. CCDS72 RQLVFHIGHGDLGSPAVLVSRNLFRPSLPSSWKEMEGPPGLRL-FGSPLSDYYAYSPVRE 410 420 430 440 450 460 440 450 460 470 480 490 pF1KE4 PDAYERLILDVFCGSQMHFVRSDELREAWRIFTPLLHQIELEKPKPIPYIYGSRGPTEAD ::. :. .: : . :. ...: .: ..::::... . :. : CCDS72 RDAHSVLLSHIFHGRKNFFITTENLLASWNFWTPLLESLAHKAPRLYPGGAENGRLLDFE 470 480 490 500 510 520 500 510 pF1KE4 ELMKRVGFQYEGTYKWVNPHKL CCDS72 FSSGRLFFSQQQPEQLVPGPGPAPMPSDFQVLRAKYRESPLVSAWSEELISKLANDIEAT 530 540 550 560 570 580 515 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Sun Nov 6 00:46:00 2016 done: Sun Nov 6 00:46:01 2016 Total Scan time: 3.000 Total Display time: 0.030 Function used was FASTA [36.3.4 Apr, 2011]