FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE6225, 222 aa
  1>>>pF1KE6225 222 - 222 aa - 222 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.4933+/-0.000678; mu= 12.7734+/- 0.041
 mean_var=58.8259+/-11.676, 0's: 0 Z-trim(109.0): 22 B-trim: 9 in 1/49
 Lambda= 0.167221
 statistics sampled from 10575 (10596) to 10575 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.707), E-opt: 0.2 (0.325), width: 16
 Scan time: 1.550

The best scores are:                                      opt  bits E(32554)
CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6          ( 222) 1432 353.3 7.3e-98
CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6          ( 222) 1292 319.5 1.1e-87
CCDS4944.1 GSTA2 gene_id:2939|Hs108|chr6          ( 222) 1263 312.5 1.4e-85
CCDS4946.1 GSTA5 gene_id:221357|Hs108|chr6        ( 222) 1217 301.4   3e-82
CCDS4948.1 GSTA4 gene_id:2941|Hs108|chr6          ( 222)  835 209.2 1.6e-54
CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11        ( 210)  295  79.0 2.6e-15
CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4         ( 199)  244  66.7 1.2e-11

>>CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6 (222 aa)
 initn: 1432 init1: 1432 opt: 1432 Z-score: 1870.5 bits: 353.3 E(32554): 7.3e-98
Smith-Waterman score: 1432; 99.5% identity (100.0% similar) in 222 aa overlap (1-222:1-222)

10 20 30 40 50 60
pF1KE6 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS49 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
10 20 30 40 50 60

70 80 90 100 110 120
pF1KE6 DGIKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
::.:::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
70 80 90 100 110 120

130 140 150 160 170 180
pF1KE6 IALIKEKTKSRYFPAFEKVLQSHGQDYLVGNKLSRADISLVELLYYVEELDSSLISNFPL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS49 IALIKEKTKSRYFPAFEKVLQSHGQDYLVGNKLSRADISLVELLYYVEELDSSLISNFPL
130 140 150 160 170 180

190 200 210 220
pF1KE6 LKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
::::::::::::::::::::::::::::::::::::::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
190 200 210 220

>>CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6 (222 aa)
 initn: 1292 init1: 1292 opt: 1292 Z-score: 1687.9 bits: 319.5 E(32554): 1.1e-87
Smith-Waterman score: 1292; 90.1% identity (94.6% similar) in 222 aa overlap (1-222:1-222)

10 20 30 40 50 60
pF1KE6 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
:: ::::::::.::::: :::::::::::::::: ::::: :::::: :::::::::::
CCDS49 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
10 20 30 40 50 60

70 80 90 100 110 120
pF1KE6 DGIKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
::.:::::::::::::::::::::::::::::::: ::.:::.:::::::.: :::::::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK
70 80 90 100 110 120

130 140 150 160 170 180
pF1KE6 IALIKEKTKSRYFPAFEKVLQSHGQDYLVGNKLSRADISLVELLYYVEELDSSLISNFPL
.:::::: :.::::::::::.::::::::::::::::: :::::::::::::::::.:::
CCDS49 LALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
130 140 150 160 170 180

190 200 210 220
pF1KE6 LKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
::::::::::::::::::::::::::: : :.::::::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
190 200 210 220

>>CCDS4944.1 GSTA2 gene_id:2939|Hs108|chr6 (222 aa)
 initn: 1263 init1: 1263 opt: 1263 Z-score: 1650.1 bits: 312.5 E(32554): 1.4e-85
Smith-Waterman score: 1263; 88.3% identity (94.6% similar) in 222 aa overlap (1-222:1-222)

10 20 30 40 50 60
pF1KE6 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
:: :::::: : ::::: ::::::::::::::::: ::::: :::::: :::::::::::
CCDS49 MAEKPKLHYSNIRGRMESIRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
10 20 30 40 50 60

70 80 90 100 110 120
pF1KE6 DGIKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
::.:::::::::::::::::::::::::.:::::: ::.:::.:::::::. .:::.:::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKEKALIDMYIEGIADLGEMILLLPFSQPEEQDAK
70 80 90 100 110 120

130 140 150 160 170 180
pF1KE6 IALIKEKTKSRYFPAFEKVLQSHGQDYLVGNKLSRADISLVELLYYVEELDSSLISNFPL
.:::.::::.::::::::::.::::::::::::::::: :::::::::::::::::.:::
CCDS49 LALIQEKTKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
130 140 150 160 170 180

190 200 210 220
pF1KE6 LKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
::::::::::::::::::::::::::: : :.:::.::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEESRKIFRF
190 200 210 220

>>CCDS4946.1 GSTA5 gene_id:221357|Hs108|chr6 (222 aa)
 initn: 1217 init1: 1217 opt: 1217 Z-score: 1590.1 bits: 301.4 E(32554): 3e-82
Smith-Waterman score: 1217; 85.1% identity (93.2% similar) in 222 aa overlap (1-222:1-222)

10 20 30 40 50 60
pF1KE6 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
:: :::::: :.:: :: :::::::::::.::::. ::::: ::::::::.:::::::::
CCDS49 MAEKPKLHYSNARGSMESIRWLLAAAGVELEEKFLESAEDLDKLRNDGSLLFQQVPMVEI
10 20 30 40 50 60

70 80 90 100 110 120
pF1KE6 DGIKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
::.::::::::::::::::::::::.::::::::::::..::.:::::: .:.:::.:::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDMKERALIDMYTEGIVDLTEMILLLLICQPEERDAK
70 80 90 100 110 120

130 140 150 160 170 180
pF1KE6 IALIKEKTKSRYFPAFEKVLQSHGQDYLVGNKLSRADISLVELLYYVEELDSSLISNFPL
::.::: :.::::::::::.:: :::::::::: ::: ::::.::::::::::::.:::
CCDS49 TALVKEKIKNRYFPAFEKVLKSHRQDYLVGNKLSWADIHLVELFYYVEELDSSLISSFPL
130 140 150 160 170 180

190 200 210 220
pF1KE6 LKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
:::::::::::::::::::::: :::: : :.::::::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSQRKPPMDEKSLEEARKIFRF
190 200 210 220

>>CCDS4948.1 GSTA4 gene_id:2941|Hs108|chr6 (222 aa)
 initn: 835 init1: 835 opt: 835 Z-score: 1092.1 bits: 209.2 E(32554): 1.6e-54
Smith-Waterman score: 835; 52.9% identity (83.7% similar) in 221 aa overlap (1-221:1-221)

10 20 30 40 50 60
pF1KE6 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
::..::::: ::::::: .::.::::::::.:.:. . :.: ::.. . :.:::::::::
CCDS49 MAARPKLHYPNGRGRMESVRWVLAAAGVEFDEEFLETKEQLYKLQDGNHLLFQQVPMVEI
10 20 30 40 50 60

70 80 90 100 110 120
pF1KE6 DGIKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
::.::::::.::.:::.:.::.::..:::.:::::.:: :: :.... :. .:.... .
CCDS49 DGMKLVQTRSILHYIADKHNLFGKNLKERTLIDMYVEGTLDLLELLIMHPFLKPDDQQKE
70 80 90 100 110 120

130 140 150 160 170 180
pF1KE6 IALIKEKTKSRYFPAFEKVLQSHGQDYLVGNKLSRADISLVELLYYVEELDSSLISNFPL
.. . .:. ::::.:::.:..:::..::::.:: ::. :.. . .:: ...: ::.
CCDS49 VVNMAQKAIIRYFPVFEKILRGHGQSFLVGNQLSLADVILLQTILALEEKIPNILSAFPF
130 140 150 160 170 180

190 200 210 220
pF1KE6 LKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
:. ...::.::.:.::.::: .::: : .. . .:::
CCDS49 LQEYTVKLSNIPTIKRFLEPGSKKKPPPDEIYVRTVYNIFRP
190 200 210 220

>>CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11 (210 aa)
 initn: 278 init1: 130 opt: 295 Z-score: 388.4 bits: 79.0 E(32554): 2.6e-15
Smith-Waterman score: 295; 32.8% identity (58.6% similar) in 198 aa overlap (9-203:8-197)

10 20 30 40 50 60
pF1KE6 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
:: ::: .: ::: : ..:. . ..: . .: .. :.: .
CCDS41 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVV-TVETWQEGSLKASCLYGQLPKFQD
10 20 30 40 50

70 80 90 100 110
pF1KE6 DGIKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLN-EMILLLPLCRPEEKDA
. : :. .:: ... .::::: .: ::.:: ..:. :: ..: :. ::
CCDS41 GDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDD
60 70 80 90 100 110

120 130 140 150 160 170
pF1KE6 KIALIKEKTKSRYFPAFEKVLQSH--GQDYLVGNKLSRADISLVELLYYVEELDSSLISN
. . . : : :: .:... :. ..::...: :: .:..:: : : . ..
CCDS41 YVKALPGQLK----P-FETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDA
120 130 140 150 160 170

180 190 200 210 220
pF1KE6 FPLLKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
::::.: :.: : .: :: .::
CCDS41 FPLLSAYVGRLSARPKLKAFL--ASPEYVNLPINGNGKQ
180 190 200 210

>>CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4 (199 aa)
 initn: 149 init1: 75 opt: 244 Z-score: 322.3 bits: 66.7 E(32554): 1.2e-11
Smith-Waterman score: 246; 27.1% identity (62.3% similar) in 207 aa overlap (6-206:5-195)

10 20 30 40 50 60
pF1KE6 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
:: ::: ::: : ::...: ...:.. : .: : .... .: : ..:..:.
CCDS36 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQA-DWPEIKS--TLPFGKIPILEV
10 20 30 40 50

70 80 90 100 110 120
pF1KE6 DGIKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
::. : :. :: :.... .: :. :. .: .. . :. . .: . ...:.:
CCDS36 DGLTLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDF---MSCFPWAE-KKQDVK
60 70 80 90 100 110

130 140 150 160 170
pF1KE6 IALIKEKTKSRYFPAFEKVLQSH--GQDYLVGNKLSRAD----ISLVELLYYVEELDSSL
...: . : . . :... :...:.::... :: : . :: . . :
CCDS36 EQMFNELL-TYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPD----L
120 130 140 150 160

180 190 200 210 220
pF1KE6 ISNFPLLKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
..: : : .:. ... .:.: .... :.:
CCDS36 LDNHPRLVTLRKKVQAIPAVANWIK----RRPQTKL
170 180 190

222 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Tue Nov 8 11:14:49 2016 done: Tue Nov 8 11:14:49 2016
 Total Scan time: 1.550 Total Display time: -0.020

Function used was FASTA [36.3.4 Apr, 2011]