FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE5247, 222 aa
  1>>>pF1KE5247 222 - 222 aa - 222 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.1586+/-0.000667; mu= 14.8868+/- 0.041
 mean_var=59.8219+/-11.873, 0's: 0 Z-trim(108.9): 20 B-trim: 0 in 0/52
 Lambda= 0.165823
 statistics sampled from 10519 (10538) to 10519 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.715), E-opt: 0.2 (0.324), width: 16
 Scan time: 1.960

The best scores are:                                      opt  bits E(32554)
CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6          ( 222) 1437 351.6 2.3e-97
CCDS4944.1 GSTA2 gene_id:2939|Hs108|chr6          ( 222) 1361 333.4 6.9e-92
CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6          ( 222) 1297 318.1 2.8e-87
CCDS4946.1 GSTA5 gene_id:221357|Hs108|chr6        ( 222) 1287 315.7 1.5e-86
CCDS4948.1 GSTA4 gene_id:2941|Hs108|chr6          ( 222)  842 209.3 1.6e-54
CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11        ( 210)  293  77.9 5.4e-15
CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4         ( 199)  242  65.7 2.4e-11

>>CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6              (222 aa)
 initn: 1437 init1: 1437 opt: 1437  Z-score: 1861.4  bits: 351.6 E(32554): 2.3e-97
Smith-Waterman score: 1437; 100.0% identity (100.0% similar) in 222 aa overlap (1-222:1-222)

               10        20        30        40        50        60
pF1KE5 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS49 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE5 LALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS49 LALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
              130       140       150       160       170       180

              190       200       210       220
pF1KE5 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
       ::::::::::::::::::::::::::::::::::::::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
              190       200       210       220

>>CCDS4944.1 GSTA2 gene_id:2939|Hs108|chr6              (222 aa)
 initn: 1361 init1: 1361 opt: 1361  Z-score: 1763.2  bits: 333.4 E(32554): 6.9e-92
Smith-Waterman score: 1361; 95.0% identity (96.8% similar) in 222 aa overlap (1-222:1-222)

               10        20        30        40        50        60
pF1KE5 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
       ::::::::: : :::::: :::::::::::::::::::::::::::::::::::::::::
CCDS49 MAEKPKLHYSNIRGRMESIRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK
       ::::::::::::::::::::::::::::.:::::::::::::::::::::   :::.:::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKEKALIDMYIEGIADLGEMILLLPFSQPEEQDAK
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE5 LALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
       ::::.:: ::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS49 LALIQEKTKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
              130       140       150       160       170       180

              190       200       210       220
pF1KE5 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
       :::::::::::::::::::::::::::::::::::.::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEESRKIFRF
              190       200       210       220

>>CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6              (222 aa)
 initn: 1297 init1: 1297 opt: 1297  Z-score: 1680.4  bits: 318.1 E(32554): 2.8e-87
Smith-Waterman score: 1297; 90.5% identity (94.6% similar) in 222 aa overlap (1-222:1-222)

               10        20        30        40        50        60
pF1KE5 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
       :: ::::::::.:::::  :::::::::::::::: ::::: :::::: :::::::::::
CCDS49 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK
       ::::::::::::::::::::::::::::::::::: ::.:::.:::::::.: :::::::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLNEMILLLPLCRPEEKDAK
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE5 LALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
       .:::::: :.::::::::::.::::::::::::::::: :::::::::::::::::.:::
CCDS49 IALIKEKTKSRYFPAFEKVLQSHGQDYLVGNKLSRADISLVELLYYVEELDSSLISNFPL
              130       140       150       160       170       180

              190       200       210       220
pF1KE5 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
       ::::::::::::::::::::::::::: : :.::::::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
              190       200       210       220

>>CCDS4946.1 GSTA5 gene_id:221357|Hs108|chr6            (222 aa)
 initn: 1287 init1: 1287 opt: 1287  Z-score: 1667.5  bits: 315.7 E(32554): 1.5e-86
Smith-Waterman score: 1287; 90.1% identity (94.6% similar) in 222 aa overlap (1-222:1-222)

               10        20        30        40        50        60
pF1KE5 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
       ::::::::: :::: ::: ::::::::::.::::..:::::::::::: :.:::::::::
CCDS49 MAEKPKLHYSNARGSMESIRWLLAAAGVELEEKFLESAEDLDKLRNDGSLLFQQVPMVEI
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK
       :::::::::::::::::::::::::.::::::::: :::.:: :::::: .: :::.:::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDMKERALIDMYTEGIVDLTEMILLLLICQPEERDAK
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE5 LALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL
        ::.::::::::::::::::::: :::::::::: ::::::::.::::::::::::::::
CCDS49 TALVKEKIKNRYFPAFEKVLKSHRQDYLVGNKLSWADIHLVELFYYVEELDSSLISSFPL
              130       140       150       160       170       180

              190       200       210       220
pF1KE5 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
       :::::::::::::::::::::: :::::::::::::::::::
CCDS49 LKALKTRISNLPTVKKFLQPGSQRKPPMDEKSLEEARKIFRF
              190       200       210       220

>>CCDS4948.1 GSTA4 gene_id:2941|Hs108|chr6              (222 aa)
 initn: 842 init1: 842 opt: 842  Z-score: 1092.1  bits: 209.3 E(32554): 1.6e-54
Smith-Waterman score: 842; 53.8% identity (84.2% similar) in 221 aa overlap (1-221:1-221)

               10        20        30        40        50        60
pF1KE5 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
       ::
.::::: :.::::::.::.::::::::.:.:... :.: ::.. ..:.::::::::: CCDS49 MAARPKLHYPNGRGRMESVRWVLAAAGVEFDEEFLETKEQLYKLQDGNHLLFQQVPMVEI 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE5 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK :::::::::.::.:::.:.::.::..:::.:::::.:: :: :.... : :.... . CCDS49 DGMKLVQTRSILHYIADKHNLFGKNLKERTLIDMYVEGTLDLLELLIMHPFLKPDDQQKE 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE5 LALIKEKIKNRYFPAFEKVLKSHGQDYLVGNKLSRADIHLVELLYYVEELDSSLISSFPL .. . .: ::::.:::.:..:::..::::.:: ::. :.. . .:: ...:.::. CCDS49 VVNMAQKAIIRYFPVFEKILRGHGQSFLVGNQLSLADVILLQTILALEEKIPNILSAFPF 130 140 150 160 170 180 190 200 210 220 pF1KE5 LKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF :. ...::.::.:.::.::: .::: :: .. . .::: CCDS49 LQEYTVKLSNIPTIKRFLEPGSKKKPPPDEIYVRTVYNIFRP 190 200 210 220 >>CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11 (210 aa) initn: 284 init1: 132 opt: 293 Z-score: 382.7 bits: 77.9 E(32554): 5.4e-15 Smith-Waterman score: 293; 32.3% identity (60.1% similar) in 198 aa overlap (9-203:8-197) 10 20 30 40 50 60 pF1KE5 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI :: .::: . : ::: : ..:. . ..: .. . .. :.: . CCDS41 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVV-TVETWQEGSLKASCLYGQLPKFQD 10 20 30 40 50 70 80 90 100 110 pF1KE5 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLG-EMILLLPVCPPEEKDA . : :. .:: ... .::::: .: ::.:: .:. :: ..: :. . :: CCDS41 GDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDD 60 70 80 90 100 110 120 130 140 150 160 170 pF1KE5 KLALIKEKIKNRYFPAFEKVLKSH--GQDYLVGNKLSRADIHLVELLYYVEELDSSLISS . . ..: : :: .:... :. ..::...: :: .:..:: : : . ... 
CCDS41 YVKALPGQLK----P-FETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDA 120 130 140 150 160 170 180 190 200 210 220 pF1KE5 FPLLKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF ::::.: :.: : .: :: .:: CCDS41 FPLLSAYVGRLSARPKLKAFL--ASPEYVNLPINGNGKQ 180 190 200 210 >>CCDS3640.1 HPGDS gene_id:27306|Hs108|chr4 (199 aa) initn: 126 init1: 71 opt: 242 Z-score: 317.1 bits: 65.7 E(32554): 2.4e-11 Smith-Waterman score: 245; 25.6% identity (59.4% similar) in 207 aa overlap (6-206:5-195) 10 20 30 40 50 60 pF1KE5 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI :: ::: ::: : :...: ...:.. :..: : .... : : ..:..:. CCDS36 MPNYKLTYFNMRGRAEIIRYIFAYLDIQYEDHRIEQA-DWPEIKST--LPFGKIPILEV 10 20 30 40 50 70 80 90 100 110 120 pF1KE5 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLGEMILLLPVCPPEEKDAK ::. : :. :: :.... .: :. :. .: .. . :. . : :: CCDS36 DGLTLHQSLAIARYLTKNTDLAGNTEMEQCHVDAIVDTLDDF------MSCFPWAEKKQD 60 70 80 90 100 110 130 140 150 160 170 pF1KE5 LALIKEKIKNRYF----PAFEKVLKSH--GQDYLVGNKLSRADIHLVELLYYVEELDSSL .::.. :. . : . . : .. :...:.::... ::.. . . .: CCDS36 ---VKEQMFNELLTYNAPHLMQDLDTYLGGREWLIGNSVTWADFYWEICSTTLLVFKPDL 120 130 140 150 160 180 190 200 210 220 pF1KE5 ISSFPLLKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF ... : : .:. ... .:.: .... :.: CCDS36 LDNHPRLVTLRKKVQAIPAVANWIK----RRPQTKL 170 180 190 222 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Mon Nov 7 22:51:16 2016 done: Mon Nov 7 22:51:16 2016 Total Scan time: 1.960 Total Display time: -0.010 Function used was FASTA [36.3.4 Apr, 2011]