FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE1322, 218 aa
  1>>>pF1KE1322 218 - 218 aa - 218 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.8283+/-0.000814; mu= 10.5453+/- 0.049
 mean_var=61.9407+/-12.412, 0's: 0 Z-trim(106.6): 20 B-trim: 8 in 1/50
 Lambda= 0.162962
 statistics sampled from 9051 (9068) to 9051 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.667), E-opt: 0.2 (0.279), width: 16
 Scan time: 1.890

The best scores are:                                opt  bits E(32554)
CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1     ( 218) 1493 359.4    1e-99
CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1     ( 218) 1332 321.5  2.5e-88
CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1     ( 195) 1300 314.0  4.2e-86
CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1     ( 218) 1281 309.5    1e-84
CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1     ( 218) 1234 298.5  2.2e-81
CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1     ( 225) 1112 269.8  9.7e-73
CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1   ( 191) 1107 268.6  1.9e-72
CCDS810.1 GSTM1 gene_id:2944|Hs108|chr1     ( 181)  918 224.2  4.2e-59
CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11  ( 210)  345  89.5  1.7e-18
CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4   ( 199)  266  70.9  6.5e-13

>>CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1 (218 aa)
 initn: 1493 init1: 1493 opt: 1493 Z-score: 1903.7 bits: 359.4 E(32554): 1e-99
Smith-Waterman score: 1493; 100.0% identity (100.0% similar) in 218 aa overlap (1-218:1-218)

               10        20        30        40        50        60
pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS80 
PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK :::::::::::::::::::::::::::::::::::::: CCDS80 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK 190 200 210 >>CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1 (218 aa) initn: 1332 init1: 1332 opt: 1332 Z-score: 1699.1 bits: 321.5 E(32554): 2.5e-88 Smith-Waterman score: 1332; 86.7% identity (95.9% similar) in 218 aa overlap (1-218:1-218) 10 20 30 40 50 60 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL : : :::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF :::::::::::::::::::::::::::::::::::::::::::.:: ::. .::.:.: CCDS80 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN :::::.:::::: .. 
.:.::::::::.:.::::::::.::::::::::::.::::::: CCDS80 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK :::::::::::::::::::::::::.:.....:::::: CCDS80 LKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK 190 200 210 >>CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1 (195 aa) initn: 1300 init1: 1300 opt: 1300 Z-score: 1659.3 bits: 314.0 E(32554): 4.2e-86 Smith-Waterman score: 1300; 100.0% identity (100.0% similar) in 189 aa overlap (1-189:1-189) 10 20 30 40 50 60 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS80 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK ::::::::: CCDS80 LKDFISRFEVSCGIM 190 >>CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1 (218 aa) initn: 1281 init1: 1281 opt: 1281 Z-score: 1634.3 bits: 309.5 E(32554): 1e-84 Smith-Waterman score: 1281; 83.0% identity (95.0% similar) in 218 aa overlap (1-218:1-218) 10 20 30 40 50 60 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL : ::::::.::::::.:::::::::::::::::::::::::::::::::::::::::::: CCDS80 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF ::::::.:::::::::: :::::::::::.:.:.:: :::::: :: :::..::.::: CCDS80 PYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDPDF 70 
80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN ::::::::. :: :.. .::::::.:::.:::::::::.:::::. ...:::.::::::: CCDS80 EKLKPEYLQALPEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK :::::::::::::::::::::::::.:..:..:::::: CCDS80 LKDFISRFEGLEKISAYMKSSRFLPRPVFTKMAVWGNK 190 200 210 >>CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1 (218 aa) initn: 1208 init1: 1208 opt: 1234 Z-score: 1574.6 bits: 298.5 E(32554): 2.2e-81 Smith-Waterman score: 1234; 83.0% identity (93.1% similar) in 218 aa overlap (1-218:1-218) 10 20 30 40 50 60 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL : :::::::::::::::::::::::::: :::::.::::::::::::::::::::::::: CCDS81 MPMTLGYWDIRGLAHAIRLLLEYTDSSYVEKKYTLGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF ::::::::::::::::: :::::::::::::::::::::::::.:: .:.:.::.::: CCDS81 PYLIDGAHKITQSNAILRYIARKHNLCGETEEEKIRVDILENQVMDNHMELVRLCYDPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN :::::.:::::: .. .:.::::::::.::::::::::::::::..:::::.::::: : CCDS81 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGDKITFVDFLAYDVLDMKRIFEPKCLDAFLN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK :::::::::::.:::::::::.:: :. . :.:..: CCDS81 LKDFISRFEGLKKISAYMKSSQFLRGLLFGKSATWNSK 190 200 210 >>CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1 (225 aa) initn: 1112 init1: 1112 opt: 1112 Z-score: 1419.4 bits: 269.8 E(32554): 9.7e-73 Smith-Waterman score: 1112; 72.4% identity (89.9% similar) in 217 aa overlap (2-218:6-222) 10 20 30 40 50 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD ::.::::::::::::::::::.::.:::::.:: :.::::::::::. 
:::: :: CCDS81 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD 10 20 30 40 50 60 60 70 80 90 100 110 pF1KE1 FPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCY :::::::.:: .::::::::: :::::::.:::::::::::::.:::.:: .:: :.:: CCDS81 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY 70 80 90 100 110 120 120 130 140 150 160 170 pF1KE1 SPDFEKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLD : : :::::.:::::: ....::.:::: ::.:.:.::::::.::.:: .:::.:.::: CCDS81 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD 130 140 150 160 170 180 180 190 200 210 pF1KE1 AFPNLKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK ::::: :. :::.::::.::..:..: :. ...: :::: CCDS81 EFPNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC 190 200 210 220 >>CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1 (191 aa) initn: 1107 init1: 1107 opt: 1107 Z-score: 1414.2 bits: 268.6 E(32554): 1.9e-72 Smith-Waterman score: 1107; 83.1% identity (94.2% similar) in 189 aa overlap (1-189:1-189) 10 20 30 40 50 60 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL : ::::::.::::::.:::::::::::::::::::::::::::::::::::::::::::: CCDS44 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF ::::::.:::::::::: :::::::::::.:.:.:: :::::: :: :::..::.::: CCDS44 PYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDPDF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN ::::::::. :: :.. .::::::.:::.:::::::::.:::::. 
...:::.::::::: CCDS44 EKLKPEYLQALPEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCLDAFPN 130 140 150 160 170 180 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK ::::::::: CCDS44 LKDFISRFEHS 190 >>CCDS810.1 GSTM1 gene_id:2944|Hs108|chr1 (181 aa) initn: 924 init1: 906 opt: 918 Z-score: 1174.5 bits: 224.2 E(32554): 4.2e-59 Smith-Waterman score: 1005; 70.6% identity (78.9% similar) in 218 aa overlap (1-218:1-181) 10 20 30 40 50 60 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL : : :::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS81 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNL 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE1 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPDF :::::::::::::::::::::::::::::::::::::::::::.:: ::. .::.:.: CCDS81 PYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEF 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE1 EKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLDAFPN :::::.:::::: .. .:.::::::::.:.: CCDS81 EKLKPKYLEELPEKLKLYSEFLGKRPWFAGNK---------------------------- 130 140 150 190 200 210 pF1KE1 LKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK ::::::::::::::::.:.....:::::: CCDS81 ---------GLEKISAYMKSSRFLPRPVFSKMAVWGNK 160 170 180 >>CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11 (210 aa) initn: 278 init1: 158 opt: 345 Z-score: 445.3 bits: 89.5 E(32554): 1.7e-18 Smith-Waterman score: 345; 29.2% identity (62.7% similar) in 209 aa overlap (4-208:5-204) 10 20 30 40 50 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN :. :. .:: :.:.:: .:..:. :. : . ..: . . . CCDS41 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ 10 20 30 40 50 60 70 80 90 100 110 pF1KE1 LPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPD :: . :: . :::.:: ...: .: :. ..: ::.... . :. . . :. . CCDS41 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N 60 70 80 90 100 110 120 130 140 150 160 170 pF1KE1 FEKLKPEYLEELPTMMQHF----SQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCL .: : .:.. :: ... 
: :: : . ..:::.:.:.:. :.: .:... :.:: CCDS41 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL 120 130 140 150 160 170 180 190 200 210 pF1KE1 DAFPNLKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK :::: :. ...:. . :..:.. : ... :. CCDS41 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ 180 190 200 210

>>CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4 (199 aa)
 initn: 229 init1: 116 opt: 266 Z-score: 345.3 bits: 70.9 E(32554): 6.5e-13
Smith-Waterman score: 266; 28.4% identity (57.4% similar) in 204 aa overlap (5-199:6-192)

10 20 30 40 50 pF1KE1 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN : :...:: :. :: .. : : .::... ..: :. .: : : . CCDS36 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPE----------IKSTLPFGK 10 20 30 40 50 60 70 80 90 100 110 pF1KE1 LPYL-IDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSP .: : .:: . :: :: :.... .: :.:: :. .:: . . : . :. : CCDS36 IPILEVDGL-TLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDFMS-----CF-P 60 70 80 90 100 120 130 140 150 160 170 pF1KE1 DFEK---LKPEYLEEL-----PTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIF :: .: ....:: : .:: .. .:: : :..:...:..:: .: CCDS36 WAEKKQDVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVF 110 120 130 140 150 160 180 190 200 210 pF1KE1 EPNCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK .:. :: : : . .. ... .. ..: CCDS36 FKPDLLDNHPRLVTLRKKVQAIPAVANWIKRRPQTKL 170 180 190

218 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sun Nov 6 22:53:42 2016 done: Sun Nov 6 22:53:42 2016
 Total Scan time: 1.890 Total Display time: -0.010

Function used was FASTA [36.3.4 Apr, 2011]