FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE1714, 218 aa
  1>>>pF1KE1714 218 - 218 aa - 218 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.6459+/-0.000822; mu= 11.3143+/- 0.049
 mean_var=62.7706+/-12.397, 0's: 0 Z-trim(106.4): 24 B-trim: 0 in 0/50
 Lambda= 0.161881
 statistics sampled from 8932 (8953) to 8932 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.673), E-opt: 0.2 (0.275), width: 16
 Scan time: 2.010

The best scores are:                                       opt bits E(32554)
CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1            ( 218) 1499 358.6 1.8e-99
CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1            ( 218) 1332 319.6 9.7e-88
CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1            ( 218) 1299 311.9   2e-85
CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1            ( 218) 1296 311.2 3.3e-85
CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1            ( 195) 1161 279.6 9.3e-76
CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1          ( 191) 1104 266.3 9.2e-72
CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1            ( 225) 1104 266.3 1.1e-71
CCDS810.1 GSTM1 gene_id:2944|Hs108|chr1            ( 181) 1050 253.7 5.5e-68
CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11         ( 210)  327  84.9 4.3e-17
CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4          ( 199)  264  70.1 1.1e-12

>>CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1 (218 aa)
 initn: 1499 init1: 1499 opt: 1499 Z-score: 1899.4 bits: 358.6 E(32554): 1.8e-99
Smith-Waterman score: 1499; 100.0% identity (100.0% similar) in 218 aa overlap (1-218:1-218)

               10        20        30        40        50        60
pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS80 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS80
PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK :::::::::::::::::::::::::::::::::::::: CCDS80 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK 190 200 210 >>CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1 (218 aa) initn: 1332 init1: 1332 opt: 1332 Z-score: 1688.6 bits: 319.6 E(32554): 9.7e-88 Smith-Waterman score: 1332; 86.7% identity (95.9% similar) in 218 aa overlap (1-218:1-218) 10 20 30 40 50 60 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL : : :::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF :::::::::::::::::::::::::::::::::::::::::::.:: ::. .::.:.: CCDS80 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN :::::.:::::: .. 
.:.::::::::.:.::::::::.::::::::::::.::::::: CCDS80 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK :::::::::::::::::::::::::.:.....:::::: CCDS80 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK 190 200 210 >>CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1 (218 aa) initn: 1269 init1: 1269 opt: 1299 Z-score: 1647.0 bits: 311.9 E(32554): 2e-85 Smith-Waterman score: 1299; 87.6% identity (95.4% similar) in 218 aa overlap (1-218:1-218) 10 20 30 40 50 60 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL ::: :::::::::::::::::::::::: :::::.::::::::::::::::::::::::: CCDS81 MPMTLGYWDIRGLAHAIRLLLEYTDSSYVEKKYTLGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF ::::::::::::::::: :::::::::::::::::::::::::.:::::.: .::.:.: CCDS81 PYLIDGAHKITQSNAILRYIARKHNLCGETEEEKIRVDILENQVMDNHMELVRLCYDPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN ::::::::::::::::::::::::::::::.::::::::.:::::..::::::::::: : CCDS81 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGDKITFVDFLAYDVLDMKRIFEPKCLDAFLN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK :::::::::::.:::::::::.:: .:.: :.:..: CCDS81 LKDFISRFEGLKKISAYMKSSQFLRGLLFGKSATWNSK 190 200 210 >>CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1 (218 aa) initn: 1296 init1: 1296 opt: 1296 Z-score: 1643.2 bits: 311.2 E(32554): 3.3e-85 Smith-Waterman score: 1296; 84.4% identity (95.9% similar) in 218 aa overlap (1-218:1-218) 10 20 30 40 50 60 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL ::: ::::.::::::.:::::::::::::::::::::::::::::::::::::::::::: CCDS80 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF ::::::.:::::::::: :::::::::::.:.:.:: :::::: ::..:::. 
.::.:.: CCDS80 PYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN :::::.::. ::: :::::.::::.::: :.:::::::..::::. ...:::.::::::: CCDS80 EKLKPEYLQALPEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK :::::::::::::::::::::::::::::.:::::::: CCDS80 LKDFISRFEGLEKISAYMKSSRFLPRPVFTKMAVWGNK 190 200 210 >>CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1 (195 aa) initn: 1161 init1: 1161 opt: 1161 Z-score: 1473.6 bits: 279.6 E(32554): 9.3e-76 Smith-Waterman score: 1161; 87.8% identity (95.2% similar) in 189 aa overlap (1-189:1-189) 10 20 30 40 50 60 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL : : :::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF :::::::::::::::::::::::::::::::::::::::::::.:: ::. .::.:.: CCDS80 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN :::::.:::::: .. 
.:.::::::::.:.::::::::.::::::::::::.::::::: CCDS80 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK ::::::::: CCDS80 LKDFISRFEVSCGIM 190 >>CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1 (191 aa) initn: 1104 init1: 1104 opt: 1104 Z-score: 1401.8 bits: 266.3 E(32554): 9.2e-72 Smith-Waterman score: 1104; 82.5% identity (95.2% similar) in 189 aa overlap (1-189:1-189) 10 20 30 40 50 60 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL ::: ::::.::::::.:::::::::::::::::::::::::::::::::::::::::::: CCDS44 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF ::::::.:::::::::: :::::::::::.:.:.:: :::::: ::..:::. .::.:.: CCDS44 PYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN :::::.::. ::: :::::.::::.::: :.:::::::..::::. ...:::.::::::: CCDS44 EKLKPEYLQALPEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK ::::::::: CCDS44 LKDFISRFEHS 190 >>CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1 (225 aa) initn: 1104 init1: 1104 opt: 1104 Z-score: 1400.6 bits: 266.3 E(32554): 1.1e-71 Smith-Waterman score: 1104; 73.1% identity (88.4% similar) in 216 aa overlap (3-218:7-222) 10 20 30 40 50 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD :.::::::::::::::::::.::.:::::.:: :.::::::::::. :::: :: CCDS81 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD 10 20 30 40 50 60 60 70 80 90 100 110 pF1KE1 FPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICY :::::::.:: .::::::::: :::::::.:::::::::::::.:::.:: . 
:: .:: CCDS81 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY 70 80 90 100 110 120 120 130 140 150 160 170 pF1KE1 NPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLD . . :::::.:::::: .:: .: :::: ::::.:.::::::.::.:: .:::.::::: CCDS81 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD 130 140 150 160 170 180 180 190 200 210 pF1KE1 AFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK ::::: :. :::.::::.::..:..: :. .::: :::: CCDS81 EFPNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC 190 200 210 220 >>CCDS810.1 GSTM1 gene_id:2944|Hs108|chr1 (181 aa) initn: 1050 init1: 1050 opt: 1050 Z-score: 1334.0 bits: 253.7 E(32554): 5.5e-68 Smith-Waterman score: 1161; 83.0% identity (83.0% similar) in 218 aa overlap (1-218:1-181) 10 20 30 40 50 60 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS81 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS81 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN :::::::::::::::::::::::::::::::: CCDS81 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNK---------------------------- 130 140 150 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK ::::::::::::::::::::::::::::: CCDS81 ---------GLEKISAYMKSSRFLPRPVFSKMAVWGNK 160 170 180 >>CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11 (210 aa) initn: 199 init1: 92 opt: 327 Z-score: 420.4 bits: 84.9 E(32554): 4.3e-17 Smith-Waterman score: 327; 28.4% identity (62.1% similar) in 211 aa overlap (2-208:3-204) 10 20 30 40 50 pF1KE1 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN :. . :. .:: :.:.:: .:..:. :. : . ..: . . . 
CCDS41 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ 10 20 30 40 50 60 70 80 90 100 110 pF1KE1 LPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPE :: . :: . :::.:: ...: .: :. ..: ::.... . : . . . :. . CCDS41 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N 60 70 80 90 100 110 120 130 140 150 160 170 pF1KE1 FEKLKPKYLEELPEKLK----LYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCL .: : :.. :: .:: : :. : . ...:..:.:.:. . :.: .:... : :: CCDS41 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL 120 130 140 150 160 170 180 190 200 210 pF1KE1 DAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK :::: :. ...:. . :..:.. : ... :. CCDS41 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ 180 190 200 210 >>CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4 (199 aa) initn: 209 init1: 96 opt: 264 Z-score: 341.3 bits: 70.1 E(32554): 1.1e-12 Smith-Waterman score: 264; 29.2% identity (57.4% similar) in 209 aa overlap (1-199:1-192) 10 20 30 40 50 pF1KE1 MPMI-LGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN :: : :...:: :. :: .. : : .::... ..: :. .: : : . CCDS36 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPE----------IKSTLPFGK 10 20 30 40 50 60 70 80 90 100 110 pF1KE1 LPYL-IDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNP .: : .:: . :: :: :.... .: :.:: :. .:: . . :.:. :. :. : CCDS36 IPILEVDGL-TLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVD-TLDDFMS----CF-P 60 70 80 90 100 120 130 140 150 160 170 pF1KE1 EFEK---LKPKYLEEL-----PEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIF :: .: ....:: :. .. . .:: : :. ::..:..:: .: CCDS36 WAEKKQDVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVF 110 120 130 140 150 160 180 190 200 210 pF1KE1 EPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK .: :: : : . .. ... .. 
..:
CCDS36 KPDLLDNHPRLVTLRKKVQAIPAVANWIKRRPQTKL
           170       180       190

218 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sun Nov 6 19:00:05 2016 done: Sun Nov 6 19:00:06 2016
 Total Scan time: 2.010 Total Display time: 0.000

Function used was FASTA [36.3.4 Apr, 2011]