FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE6165, 147 aa
  1>>>pF1KE6165 147 - 147 aa - 147 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.3497+/-0.000723; mu= 10.8342+/- 0.044
 mean_var=57.3602+/-11.517, 0's: 0 Z-trim(108.3): 13 B-trim: 41 in 1/49
 Lambda= 0.169344
 statistics sampled from 10091 (10103) to 10091 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.711), E-opt: 0.2 (0.31), width: 16
 Scan time: 1.480

The best scores are:                              opt  bits E(32554)
CCDS31376.1 HBD gene_id:3045|Hs108|chr11  ( 147)  987 248.9 8.4e-67
CCDS7753.1 HBB gene_id:3043|Hs108|chr11   ( 147)  929 234.7 1.6e-62
CCDS7756.1 HBE1 gene_id:3046|Hs108|chr11  ( 147)  754 192.0 1.2e-49
CCDS7755.1 HBG2 gene_id:3048|Hs108|chr11  ( 147)  743 189.3 7.4e-49
CCDS7754.1 HBG1 gene_id:3047|Hs108|chr11  ( 147)  732 186.6 4.8e-48
CCDS10398.1 HBA2 gene_id:3040|Hs108|chr16 ( 142)  373  98.9 1.2e-21
CCDS10399.1 HBA1 gene_id:3039|Hs108|chr16 ( 142)  373  98.9 1.2e-21
CCDS10397.1 HBZ gene_id:3050|Hs108|chr16  ( 142)  330  88.4 1.7e-18
CCDS10400.1 HBQ1 gene_id:3049|Hs108|chr16 ( 142)  321  86.2 7.8e-18
CCDS32347.1 HBM gene_id:3042|Hs108|chr16  ( 141)  282  76.6 5.7e-15

>>CCDS31376.1 HBD gene_id:3045|Hs108|chr11 (147 aa)
 initn: 987 init1: 987 opt: 987 Z-score: 1312.8 bits: 248.9 E(32554): 8.4e-67
Smith-Waterman score: 987; 100.0% identity (100.0% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE6 MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS31 MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE6 VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARNFG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS31 VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARNFG
               70        80        90       100       110       120

              130       140
pF1KE6 KEFTPQMQAAYQKVVAGVANALAHKYH
       :::::::::::::::::::::::::::
CCDS31 KEFTPQMQAAYQKVVAGVANALAHKYH
              130       140

>>CCDS7753.1 HBB gene_id:3043|Hs108|chr11 (147 aa)
 initn: 929 init1: 929 opt: 929 Z-score: 1236.2 bits: 234.7 E(32554): 1.6e-62
Smith-Waterman score: 929; 93.2% identity (98.0% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE6 MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
       :::::::::.::.::::::::: :::::::::::::::::::::::::::.:::::::::
CCDS77 MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE6 VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARNFG
       ::::::::::::::::::::::::::. ::::::::::::::::::::::::::::..::
CCDS77 VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG
               70        80        90       100       110       120

              130       140
pF1KE6 KEFTPQMQAAYQKVVAGVANALAHKYH
       ::::: .::::::::::::::::::::
CCDS77 KEFTPPVQAAYQKVVAGVANALAHKYH
              130       140

>>CCDS7756.1 HBE1 gene_id:3046|Hs108|chr11 (147 aa)
 initn: 754 init1: 754 opt: 754 Z-score: 1005.1 bits: 192.0 E(32554): 1.2e-49
Smith-Waterman score: 754; 72.8% identity (94.6% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE6 MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
       :::.: :::.::..::.:.::. .:::::::::::::::::::.:::.::::.:..::::
CCDS77 MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE6 VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARNFG
       ::::::::: .:.:.. ..:::: .:..::::::::::::::::.:::::.: .:: .::
CCDS77 VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATHFG
               70        80        90       100       110       120

              130       140
pF1KE6 KEFTPQMQAAYQKVVAGVANALAHKYH
       :::::..:::.::.:..:: :::::::
CCDS77 KEFTPEVQAAWQKLVSAVAIALAHKYH
              130       140

>>CCDS7755.1 HBG2 gene_id:3048|Hs108|chr11 (147 aa)
 initn: 743 init1: 743 opt: 743 Z-score: 990.6 bits: 189.3 E(32554): 7.4e-49
Smith-Waterman score: 743; 72.1% identity (93.9% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE6 MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
       : :.: :.:.....:::::::. .:::.:::::::::::::::.:::.::: .:.:::::
CCDS77 MGHFTEEDKATITSLWGKVNVEDAGGETLGRLLVVYPWTQRFFDSFGNLSSASAIMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE6 VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARNFG
       ::::::::: ...:.. :::.:::::.:::::::::::::::::.::::::: ::: .::
CCDS77 VKAHGKKVLTSLGDAIKHLDDLKGTFAQLSELHCDKLHVDPENFKLLGNVLVTVLAIHFG
               70        80        90       100       110       120

              130       140
pF1KE6 KEFTPQMQAAYQKVVAGVANALAHKYH
       :::::..::..::.:.:::.::. .::
CCDS77 KEFTPEVQASWQKMVTGVASALSSRYH
              130       140

>>CCDS7754.1 HBG1 gene_id:3047|Hs108|chr11 (147 aa)
 initn: 732 init1: 732 opt: 732 Z-score: 976.1 bits: 186.6 E(32554): 4.8e-48
Smith-Waterman score: 732; 71.4% identity (93.2% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE6 MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
       : :.: :.:.....:::::::. .:::.:::::::::::::::.:::.::: .:.:::::
CCDS77 MGHFTEEDKATITSLWGKVNVEDAGGETLGRLLVVYPWTQRFFDSFGNLSSASAIMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE6 VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARNFG
       ::::::::: ...:.  :::.:::::.:::::::::::::::::.::::::: ::: .::
CCDS77 VKAHGKKVLTSLGDATKHLDDLKGTFAQLSELHCDKLHVDPENFKLLGNVLVTVLAIHFG
               70        80        90       100       110       120

              130       140
pF1KE6 KEFTPQMQAAYQKVVAGVANALAHKYH
       :::::..::..::.:..::.::. .::
CCDS77 KEFTPEVQASWQKMVTAVASALSSRYH
              130       140

>>CCDS10398.1 HBA2 gene_id:3040|Hs108|chr16 (142 aa)
 initn: 305 init1: 263 opt: 373 Z-score: 502.3 bits: 98.9 E(32554): 1.2e-21
Smith-Waterman score: 373; 43.4% identity (74.5% similar) in 145 aa overlap (4-146:3-141)

               10        20          30        40        50
pF1KE6 MVHLTPEEKTAVNALWGKVNVDA--VGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGN
          :.: .:: :.: ::::.. :   :.::: :... .: :. .:  : :::      :.
CCDS10  MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLSH-----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE6 PKVKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARN
        .::.:::::  :.....::.:.. ...: ::.::  ::.::: ::.::.. :. .:: .
CCDS10 AQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAH
            60        70        80        90       100       110

      120       130       140
pF1KE6 FGKEFTPQMQAAYQKVVAGVANALAHKYH
       .  :::: ..:. .: .:.:...:. ::
CCDS10 LPAEFTPAVHASLDKFLASVSTVLTSKYR
           120       130       140

>>CCDS10399.1 HBA1 gene_id:3039|Hs108|chr16 (142 aa)
 initn: 305 init1: 263 opt: 373 Z-score: 502.3 bits: 98.9 E(32554): 1.2e-21
Smith-Waterman score: 373; 43.4% identity (74.5% similar) in 145 aa overlap (4-146:3-141)

               10        20          30        40        50
pF1KE6 MVHLTPEEKTAVNALWGKVNVDA--VGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGN
          :.: .:: :.: ::::.. :   :.::: :... .: :. .:  : :::      :.
CCDS10  MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLSH-----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE6 PKVKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARN
        .::.:::::  :.....::.:.. ...: ::.::  ::.::: ::.::.. :. .:: .
CCDS10 AQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAH
            60        70        80        90       100       110

      120       130       140
pF1KE6 FGKEFTPQMQAAYQKVVAGVANALAHKYH
       .  :::: ..:. .: .:.:...:. ::
CCDS10 LPAEFTPAVHASLDKFLASVSTVLTSKYR
           120       130       140

>>CCDS10397.1 HBZ gene_id:3050|Hs108|chr16 (142 aa)
 initn: 239 init1: 239 opt: 330 Z-score: 445.5 bits: 88.4 E(32554): 1.7e-18
Smith-Waterman score: 330; 38.6% identity (73.1% similar) in 145 aa overlap (4-146:3-141)

               10        20          30        40        50
pF1KE6 MVHLTPEEKTAVNALWGKVNV--DAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGN
          ::  :.: . ..:.:...  :..: :.: ::.. .: :. .:  : ::  :    :.
CCDS10  MSLTKTERTIIVSMWAKISTQADTIGTETLERLFLSHPQTKTYFPHF-DLH-P----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE6 PKVKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARN
        ...:::.::..: .:..  .:.. :..:.:::::   :.::: ::.::.. :. .::
CCDS10 AQLRAHGSKVVAAVGDAVKSIDDIGGALSKLSELHAYILRVDPVNFKLLSHCLLVTLAAR
            60        70        80        90       100       110

      120       130       140
pF1KE6 FGKEFTPQMQAAYQKVVAGVANALAHKYH
       :  .:: . .::..: .. :...:..::
CCDS10 FPADFTAEAHAAWDKFLSVVSSVLTEKYR
           120       130       140

>>CCDS10400.1 HBQ1 gene_id:3049|Hs108|chr16 (142 aa)
 initn: 319 init1: 226 opt: 321 Z-score: 433.6 bits: 86.2 E(32554): 7.8e-18
Smith-Waterman score: 321; 40.7% identity (70.3% similar) in 145 aa overlap (4-146:3-141)

               10          20        30        40        50
pF1KE6 MVHLTPEEKTAVNALWGKV--NVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGN
          :. :... : ::: :.  :: .   ::: : ....: :. .: :  ::: :    :.
CCDS10  MALSAEDRALVRALWKKLGSNVGVYTTEALERTFLAFPATKTYF-SHLDLS-P----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE6 PKVKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARN
        .:.:::.::  :.: .. .::.:  ..: ::.::  .:.::: .:.:::. :. .:::.
CCDS10 SQVRAHGQKVADALSLAVERLDDLPHALSALSHLHACQLRVDPASFQLLGHCLLVTLARH
            60        70        80        90       100       110

      120       130       140
pF1KE6 FGKEFTPQMQAAYQKVVAGVANALAHKYH
       .  .:.: .::. .: .. : .::. .:
CCDS10 YPGDFSPALQASLDKFLSHVISALVSEYR
           120       130       140

>>CCDS32347.1 HBM gene_id:3042|Hs108|chr16 (141 aa)
 initn: 263 init1: 229 opt: 282 Z-score: 382.2 bits: 76.6 E(32554): 5.7e-15
Smith-Waterman score: 282; 36.6% identity (68.3% similar) in 145 aa overlap (4-146:2-140)

               10          20        30        40        50
pF1KE6 MVHLTPEEKTAVNALWGKV--NVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGN
          :. .:.. .  .:  .  .    :.: : ::..::: :. .:  ..  .  ::.
CCDS32   MLSAQERAQIAQVWDLIAGHEAQFGAELLLRLFTVYPSTKVYFPHLS--ACQDAT---
                 10        20        30        40          50

       60        70        80        90       100       110
pF1KE6 PKVKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARN
        .. .::...:.: . .. :.:::....: :..::   :.::: :: :: . .  ::: .
CCDS32 -QLLSHGQRMLAAVGAAVQHVDNLRAALSPLADLHALVLRVDPANFPLLIQCFHVVLASH
             60        70        80        90       100       110

      120       130       140
pF1KE6 FGKEFTPQMQAAYQKVVAGVANALAHKYH
       .  ::: :::::..: ..::: .:..::
CCDS32 LQDEFTVQMQAAWDKFLTGVAVVLTEKYR
            120       130       140

147 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Tue Nov 8 10:00:13 2016 done: Tue Nov 8 10:00:14 2016
 Total Scan time: 1.480 Total Display time: -0.010

Function used was FASTA [36.3.4 Apr, 2011]