FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE1687, 199 aa
  1>>>pF1KE1687 199 - 199 aa - 199 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.7288+/-0.000899; mu= 9.9583+/- 0.054
 mean_var=59.1395+/-11.841, 0's: 0 Z-trim(104.5): 33 B-trim: 27 in 1/49
 Lambda= 0.166777
 statistics sampled from 7905 (7931) to 7905 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.636), E-opt: 0.2 (0.244), width: 16
 Scan time: 1.590

The best scores are:                                opt  bits E(32554)
CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4   ( 199) 1363 336.4   7e-93
CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1     ( 218)  266  72.5 2.2e-13
CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1     ( 218)  264  72.0   3e-13
CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1     ( 218)  256  70.1 1.2e-12
CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1     ( 195)  255  69.8 1.2e-12
CCDS4944.1 GSTA2 gene_id:2939|Hs108|chr6    ( 222)  253  69.3 1.9e-12
CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1   ( 191)  245  67.4 6.4e-12
CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6    ( 222)  245  67.4 7.4e-12
CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6    ( 222)  242  66.7 1.2e-11
CCDS4946.1 GSTA5 gene_id:221357|Hs108|chr6  ( 222)  241  66.4 1.4e-11
CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1     ( 218)  239  66.0   2e-11

>>CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4 (199 aa)
 initn: 1363 init1: 1363 opt: 1363 Z-score: 1781.0 bits: 336.4 E(32554): 7e-93
Smith-Waterman score: 1363; 100.0% identity (100.0% similar) in 199 aa overlap (1-199:1-199)

               10        20        30        40        50        60
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPEIKSTLPFGKIPILEVDGLT
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS36 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPEIKSTLPFGKIPILEVDGLT
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE1 LHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDFMSCFPWAEKKQDVKEQMFNELL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS36 LHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDFMSCFPWAEKKQDVKEQMFNELL
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE1 TYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLDNHPRLVTLRKK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS36 TYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLDNHPRLVTLRKK
              130       140       150       160       170       180

              190
pF1KE1 VQAIPAVANWIKRRPQTKL
       :::::::::::::::::::
CCDS36 VQAIPAVANWIKRRPQTKL
              190

>>CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1 (218 aa)
 initn: 229 init1: 116 opt: 266 Z-score: 353.8 bits: 72.5 E(32554): 2.2e-13
Smith-Waterman score: 266; 27.8% identity (57.6% similar) in 198 aa overlap (6-192:5-199)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPE----------IKSTLPFGK
       : :...:: :. :: .. : : .::... ..: :. .: : : .
CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
       10 20 30 40 50

       60 70 80 90 100
pF1KE1 IPILEVDGL-TLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDFMSCFPWAEKKQ
       .: : .:: . :: :: :.... .: :.:: :. .:: . . : . . . .
CCDS80 LPYL-IDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSP
       60 70 80 90 100 110

       110 120 130 140 150 160
pF1KE1 DVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLD
       : :.. : : . : .:: .. .:: : :..:...:..:: .:.:. ::
CCDS80 DF-EKLKPEYLE-ELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLD
       120 130 140 150 160 170

       170 180 190
pF1KE1 NHPRLVTLRKKVQAIPAVANWIKRRPQTKL
       : : . .. ... .. ..:
CCDS80 AFPNLKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK
       180 190 200 210

>>CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1 (218 aa)
 initn: 209 init1: 96 opt: 264 Z-score: 351.2 bits: 72.0 E(32554): 3e-13
Smith-Waterman score: 264; 29.2% identity (57.9% similar) in 209 aa overlap (1-192:1-199)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPE----------IKSTLPFGK
       :: . : :...:: :. :: .. : : .::... ..: :. .: : : .
CCDS80 MP-MILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
       10 20 30 40 50

       60 70 80 90 100
pF1KE1 IPILEVDGL-TLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVD-TLDDFMS----CF-P
       .: : .:: . :: :: :.... .: :.:: :. .:: . . :.:. :. :. :
CCDS80 LPYL-IDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNP
       60 70 80 90 100 110

       110 120 130 140 150 160
pF1KE1 WAEKKQDVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVF
       :: .: ....:: :. .. . .:: : :. ::..:..:: .:
CCDS80 EFEK---LKPKYLEEL-----PEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIF
       120 130 140 150 160 170

       170 180 190
pF1KE1 KPDLLDNHPRLVTLRKKVQAIPAVANWIKRRPQTKL
       .: :: : : . .. ... .. ..:
CCDS80 EPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK
       180 190 200 210

>>CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1 (218 aa)
 initn: 202 init1: 90 opt: 256 Z-score: 340.8 bits: 70.1 E(32554): 1.2e-12
Smith-Waterman score: 256; 28.2% identity (58.4% similar) in 209 aa overlap (1-192:1-199)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPE----------IKSTLPFGK
       :: . : :.:.:: :. :: .. : : .::... ..: :. .: : : .
CCDS80 MP-MTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
       10 20 30 40 50

       60 70 80 90 100
pF1KE1 IPILEVDGL-TLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDT-LDDFMS----CF-P
       .: : .:: . :: :: ::.... .: :..: :: . : . . .:. :. :. :
CCDS80 LPYL-IDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDP
       60 70 80 90 100 110

       110 120 130 140 150 160
pF1KE1 WAEKKQDVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVF
       :: .: .... : :.... . .:: . :..:...:..:: ::
CCDS80 DFEK---LKPEYLQAL-----PEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVF
       120 130 140 150 160 170

       170 180 190
pF1KE1 KPDLLDNHPRLVTLRKKVQAIPAVANWIKRRPQTKL
       .:. :: : : . .. ... .. ..:
CCDS80 EPSCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFTKMAVWGNK
       180 190 200 210

>>CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1 (195 aa)
 initn: 229 init1: 116 opt: 255 Z-score: 340.4 bits: 69.8 E(32554): 1.2e-12
Smith-Waterman score: 255; 30.0% identity (57.2% similar) in 180 aa overlap (6-174:5-181)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPE----------IKSTLPFGK
       : :...:: :. :: .. : : .::... ..: :. .: : : .
CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
       10 20 30 40 50

       60 70 80 90 100
pF1KE1 IPILEVDGL-TLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDFMSCFPWAEKKQ
       .: : .:: . :: :: :.... .: :.:: :. .:: . . : . . . .
CCDS80 LPYL-IDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSP
       60 70 80 90 100 110

       110 120 130 140 150 160
pF1KE1 DVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLD
       : :.. : : . : .:: .. .:: : :..:...:..:: .:.:. ::
CCDS80 DF-EKLKPEYLE-ELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLD
       120 130 140 150 160 170

       170 180 190
pF1KE1 NHPRLVTLRKKVQAIPAVANWIKRRPQTKL
       : :
CCDS80 AFPNLKDFISRFEVSCGIM
       180 190

>>CCDS4944.1 GSTA2 gene_id:2939|Hs108|chr6 (222 aa)
 initn: 124 init1: 69 opt: 253 Z-score: 336.8 bits: 69.3 E(32554): 1.9e-12
Smith-Waterman score: 255; 26.5% identity (63.7% similar) in 204 aa overlap (5-195:6-206)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQA-DWPEIKST--LPFGKIPILEV
       :: : :.::: : ::...: ...:.. :..: : .... : : ..:..:.
CCDS49 MAEKPKLHYSNIRGRMESIRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
       10 20 30 40 50 60

       60 70 80 90 100 110
pF1KE1 DGLTLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDF---MSCFPWAE-KKQDVK
       ::. : :. :: :.... .: :. :. .: .. . :. . .:... ..::.:
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKEKALIDMYIEGIADLGEMILLLPFSQPEEQDAK
       70 80 90 100 110 120

       120 130 140 150 160 170
pF1KE1 EQMFNELLTYNA--PHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLDN
       ...: : : : . . : .. :...:.::... ::.. . . .:...
CCDS49 LALIQEK-TKNRYFPAFEKVLKSH--GQDYLVGNKLSRADIHLVELLYYVEELDSSLISS
       130 140 150 160 170

       180 190
pF1KE1 HPRLVTLRKKVQAIPAVANWIK----RRPQTKL
       : : .:. ... .:.: .... :.:
CCDS49 FPLLKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEESRKIFRF
       180 190 200 210 220

>>CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1 (191 aa)
 initn: 202 init1: 90 opt: 245 Z-score: 327.5 bits: 67.4 E(32554): 6.4e-12
Smith-Waterman score: 245; 30.4% identity (58.1% similar) in 191 aa overlap (1-174:1-181)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQADWPE----------IKSTLPFGK
       :: . : :.:.:: :. :: .. : : .::... ..: :. .: : : .
CCDS44 MP-MTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
       10 20 30 40 50

       60 70 80 90 100
pF1KE1 IPILEVDGL-TLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDT-LDDFMS----CF-P
       .: : .:: . :: :: ::.... .: :..: :: . : . . .:. :. :. :
CCDS44 LPYL-IDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDP
       60 70 80 90 100 110

       110 120 130 140 150 160
pF1KE1 WAEKKQDVKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVF
       :: .: .... : :.... . .:: . :..:...:..:: ::
CCDS44 DFEK---LKPEYLQAL-----PEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVF
       120 130 140 150 160 170

       170 180 190
pF1KE1 KPDLLDNHPRLVTLRKKVQAIPAVANWIKRRPQTKL
       .:. :: : :
CCDS44 EPSCLDAFPNLKDFISRFEHS
       180 190

>>CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6 (222 aa)
 initn: 150 init1: 75 opt: 245 Z-score: 326.4 bits: 67.4 E(32554): 7.4e-12
Smith-Waterman score: 247; 27.1% identity (60.5% similar) in 210 aa overlap (5-195:6-206)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQA-DWPEIKS--TLPFGKIPILEV
       :: ::: ::: : ::...: ...:.. : .: : .... .: : ..:..:.
CCDS49 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
       10 20 30 40 50 60

       60 70 80 90 100
pF1KE1 DGLTLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDFMS-------CFPWAEKKQ
       ::. : :. :: :.... .: :. :. .: .. . :. : : ...
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRP---EEK
       70 80 90 100 110

       110 120 130 140 150 160
pF1KE1 DVKEQMFNELL-TYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDL-
       :.: ...: . : . . :... :...:.::... :: : . :: . .:
CCDS49 DAKIALIKEKTKSRYFPAFEKVLQSH--GQDYLVGNKLSRAD----ISLVELLYYVEELD
       120 130 140 150 160 170

       170 180 190
pF1KE1 ---LDNHPRLVTLRKKVQAIPAVANWIK----RRPQTKL
       ..: : : .:. ... .:.: .... :.:
CCDS49 SSLISNFPLLKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
       180 190 200 210 220

>>CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6 (222 aa)
 initn: 126 init1: 71 opt: 242 Z-score: 322.5 bits: 66.7 E(32554): 1.2e-11
Smith-Waterman score: 245; 25.9% identity (60.0% similar) in 205 aa overlap (5-195:6-206)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQA-DWPEIKST--LPFGKIPILEV
       :: ::: ::: : :...: ...:.. :..: : .... : : ..:..:.
CCDS49 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
       10 20 30 40 50 60

       60 70 80 90 100 110
pF1KE1 DGLTLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDF------MSCFPWAEKKQD
       ::. : :. :: :.... .: :. :. .: .. . :. . : :: :
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEK--D
       70 80 90 100 110

       120 130 140 150 160
pF1KE1 VKEQMFNELLTYNA-PHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLD
       .: ...: . : . . : .. :...:.::... ::.. . . .:..
CCDS49 AKLALIKEKIKNRYFPAFEKVLKSH--GQDYLVGNKLSRADIHLVELLYYVEELDSSLIS
       120 130 140 150 160 170

       170 180 190
pF1KE1 NHPRLVTLRKKVQAIPAVANWIK----RRPQTKL
       . : : .:. ... .:.: .... :.:
CCDS49 SFPLLKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
       180 190 200 210 220

>>CCDS4946.1 GSTA5 gene_id:221357|Hs108|chr6 (222 aa)
 initn: 131 init1: 58 opt: 241 Z-score: 321.2 bits: 66.4 E(32554): 1.4e-11
Smith-Waterman score: 241; 25.2% identity (60.4% similar) in 202 aa overlap (5-198:6-205)

       10 20 30 40 50
pF1KE1 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQA-DWPEIKS--TLPFGKIPILEV
       :: : : :: : ::...: .. :.. .:.: : .... .: : ..:..:.
CCDS49 MAEKPKLHYSNARGSMESIRWLLAAAGVELEEKFLESAEDLDKLRNDGSLLFQQVPMVEI
       10 20 30 40 50 60

       60 70 80 90 100 110
pF1KE1 DGLTLHQSLAIARYLTKNTDLAGNTEMEQCHVD----AIVDTLDDFMSCFPWAEKKQDVK
       ::. : :. :: :.... .: :. :. .: .::: . .. . ...:.:
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDMKERALIDMYTEGIVDLTEMILLLLICQPEERDAK
       70 80 90 100 110 120

       120 130 140 150 160 170
pF1KE1 EQMFNELLTYNA-PHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDLLDNH
       . .: . : . . : .. ...:.::...:::.. . . .:...
CCDS49 TALVKEKIKNRYFPAFEKVLKSHR--QDYLVGNKLSWADIHLVELFYYVEELDSSLISSF
       130 140 150 160 170

       180 190
pF1KE1 PRLVTLRKKVQAIPAVANWIKRRPQTKL
       : : .:. ... .:.: .... : :
CCDS49 PLLKALKTRISNLPTVKKFLQPGSQRKPPMDEKSLEEARKIFRF
       180 190 200 210 220

199 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sun Nov 6 17:23:43 2016 done: Sun Nov 6 17:23:43 2016
 Total Scan time: 1.590 Total Display time: -0.010

Function used was FASTA [36.3.4 Apr, 2011]