FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE5116, 147 aa
  1>>>pF1KE5116 147 - 147 aa - 147 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.6208+/-0.000776; mu= 9.1705+/- 0.046
 mean_var=56.6577+/-11.109, 0's: 0 Z-trim(107.4): 16 B-trim: 0 in 0/51
 Lambda= 0.170390
 statistics sampled from 9538 (9551) to 9538 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.693), E-opt: 0.2 (0.293), width: 16
 Scan time: 1.550

The best scores are:                                 opt bits E(32554)
CCDS7756.1 HBE1 gene_id:3046|Hs108|chr11     ( 147)  968 245.8 7.1e-66
CCDS7755.1 HBG2 gene_id:3048|Hs108|chr11     ( 147)  810 207.0 3.5e-54
CCDS7754.1 HBG1 gene_id:3047|Hs108|chr11     ( 147)  809 206.7 4.2e-54
CCDS7753.1 HBB gene_id:3043|Hs108|chr11      ( 147)  774 198.1 1.6e-51
CCDS31376.1 HBD gene_id:3045|Hs108|chr11     ( 147)  754 193.2 4.9e-50
CCDS10397.1 HBZ gene_id:3050|Hs108|chr16     ( 142)  357  95.6 1.1e-20
CCDS32347.1 HBM gene_id:3042|Hs108|chr16     ( 141)  332  89.5 7.9e-19
CCDS10399.1 HBA1 gene_id:3039|Hs108|chr16    ( 142)  332  89.5 8e-19
CCDS10398.1 HBA2 gene_id:3040|Hs108|chr16    ( 142)  332  89.5 8e-19
CCDS10400.1 HBQ1 gene_id:3049|Hs108|chr16    ( 142)  318  86.0 8.7e-18
CCDS11746.1 CYGB gene_id:114757|Hs108|chr17  ( 190)  259  71.5 2.7e-13

>>CCDS7756.1 HBE1 gene_id:3046|Hs108|chr11 (147 aa)
 initn: 968 init1: 968 opt: 968 Z-score: 1296.1 bits: 245.8 E(32554): 7.1e-66
Smith-Waterman score: 968; 100.0% identity (100.0% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE5 MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS77 MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATHFG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS77 VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATHFG
               70        80        90       100       110       120

              130       140
pF1KE5 KEFTPEVQAAWQKLVSAVAIALAHKYH
       :::::::::::::::::::::::::::
CCDS77 KEFTPEVQAAWQKLVSAVAIALAHKYH
              130       140

>>CCDS7755.1 HBG2 gene_id:3048|Hs108|chr11 (147 aa)
 initn: 810 init1: 810 opt: 810 Z-score: 1086.2 bits: 207.0 E(32554): 3.5e-54
Smith-Waterman score: 810; 79.6% identity (94.6% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE5 MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
       : ::: :.::..::::.:.:::.::::.::::::::::::::::::::::: :::.::::
CCDS77 MGHFTEEDKATITSLWGKVNVEDAGGETLGRLLVVYPWTQRFFDSFGNLSSASAIMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATHFG
       :::::::::::.:::::..:.:: .::.::::::::::::::::::::::.: .:: :::
CCDS77 VKAHGKKVLTSLGDAIKHLDDLKGTFAQLSELHCDKLHVDPENFKLLGNVLVTVLAIHFG
               70        80        90       100       110       120

              130       140
pF1KE5 KEFTPEVQAAWQKLVSAVAIALAHKYH
       :::::::::.:::.:..:: ::. .::
CCDS77 KEFTPEVQASWQKMVTGVASALSSRYH
              130       140

>>CCDS7754.1 HBG1 gene_id:3047|Hs108|chr11 (147 aa)
 initn: 809 init1: 809 opt: 809 Z-score: 1084.9 bits: 206.7 E(32554): 4.2e-54
Smith-Waterman score: 809; 79.6% identity (93.9% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE5 MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
       : ::: :.::..::::.:.:::.::::.::::::::::::::::::::::: :::.::::
CCDS77 MGHFTEEDKATITSLWGKVNVEDAGGETLGRLLVVYPWTQRFFDSFGNLSSASAIMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATHFG
       :::::::::::.::: :..:.:: .::.::::::::::::::::::::::.: .:: :::
CCDS77 VKAHGKKVLTSLGDATKHLDDLKGTFAQLSELHCDKLHVDPENFKLLGNVLVTVLAIHFG
               70        80        90       100       110       120

              130       140
pF1KE5 KEFTPEVQAAWQKLVSAVAIALAHKYH
       :::::::::.:::.:.::: ::. .::
CCDS77 KEFTPEVQASWQKMVTAVASALSSRYH
              130       140

>>CCDS7753.1 HBB gene_id:3043|Hs108|chr11 (147 aa)
 initn: 774 init1: 774 opt: 774 Z-score: 1038.4 bits: 198.1 E(32554): 1.6e-51
Smith-Waterman score: 774; 75.5% identity (93.9% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE5 MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
       :::.: :::.:::.::.:.::.:.:::::::::::::::::::.:::.::.:.:..::::
CCDS77 MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATHFG
       ::::::::: .:.:.. ..:::: .:: ::::::::::::::::.:::::.: .:: :::
CCDS77 VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG
               70        80        90       100       110       120

              130       140
pF1KE5 KEFTPEVQAAWQKLVSAVAIALAHKYH
       ::::: ::::.::.:..:: :::::::
CCDS77 KEFTPPVQAAYQKVVAGVANALAHKYH
              130       140

>>CCDS31376.1 HBD gene_id:3045|Hs108|chr11 (147 aa)
 initn: 754 init1: 754 opt: 754 Z-score: 1011.8 bits: 193.2 E(32554): 4.9e-50
Smith-Waterman score: 754; 72.8% identity (94.6% similar) in 147 aa overlap (1-147:1-147)

               10        20        30        40        50        60
pF1KE5 MVHFTAEEKAAVTSLWSKMNVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGNPK
       :::.: :::.::..::.:.::. .:::::::::::::::::::.:::.::::.:..::::
CCDS31 MVHLTPEEKTAVNALWGKVNVDAVGGEALGRLLVVYPWTQRFFESFGDLSSPDAVMGNPK
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE5 VKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATHFG
       ::::::::: .:.:.. ..:::: .:..::::::::::::::::.:::::.: .:: .::
CCDS31 VKAHGKKVLGAFSDGLAHLDNLKGTFSQLSELHCDKLHVDPENFRLLGNVLVCVLARNFG
               70        80        90       100       110       120

              130       140
pF1KE5 KEFTPEVQAAWQKLVSAVAIALAHKYH
       :::::..:::.::.:..:: :::::::
CCDS31 KEFTPQMQAAYQKVVAGVANALAHKYH
              130       140

>>CCDS10397.1 HBZ gene_id:3050|Hs108|chr16 (142 aa)
 initn: 287 init1: 287 opt: 357 Z-score: 484.7 bits: 95.6 E(32554): 1.1e-20
Smith-Waterman score: 357; 40.0% identity (75.9% similar) in 145 aa overlap (4-146:3-141)

               10        20          30        40        50
pF1KE5 MVHFTAEEKAAVTSLWSKMNVEE--AGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGN
          .:  :.. ..:.:.:....    : :.: ::.. .: :. .:  : .:  :    :.
CCDS10  MSLTKTERTIIVSMWAKISTQADTIGTETLERLFLSHPQTKTYFPHF-DLH-P----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE5 PKVKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATH
        ...:::.::... :::.:..:..  :..::::::   :.::: :::::.. ... ::..
CCDS10 AQLRAHGSKVVAAVGDAVKSIDDIGGALSKLSELHAYILRVDPVNFKLLSHCLLVTLAAR
            60        70        80        90       100       110

      120       130       140
pF1KE5 FGKEFTPEVQAAWQKLVSAVAIALAHKYH
       :  .:: :..:::.:..:.:. .:..::
CCDS10 FPADFTAEAHAAWDKFLSVVSSVLTEKYR
           120       130       140

>>CCDS32347.1 HBM gene_id:3042|Hs108|chr16 (141 aa)
 initn: 293 init1: 259 opt: 332 Z-score: 451.5 bits: 89.5 E(32554): 7.9e-19
Smith-Waterman score: 332; 36.6% identity (74.5% similar) in 145 aa overlap (4-146:2-140)

               10        20          30        40        50
pF1KE5 MVHFTAEEKAAVTSLWSKMNVEEA--GGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGN
          ..:.:.: ....:. .  .::  :.: : ::..::: :. .:  ..  .. . .:
CCDS32   MLSAQERAQIAQVWDLIAGHEAQFGAELLLRLFTVYPSTKVYFPHLSACQDATQLL--
                 10        20        30        40        50

       60        70        80        90       100       110
pF1KE5 PKVKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATH
           .::...:.. : :....:::. :.. :..::   :.::: :: :: . . ..::.:
CCDS32 ----SHGQRMLAAVGAAVQHVDNLRAALSPLADLHALVLRVDPANFPLLIQCFHVVLASH
             60        70        80        90       100       110

      120       130       140
pF1KE5 FGKEFTPEVQAAWQKLVSAVAIALAHKYH
       .  ::: ..::::.:....::..:..::
CCDS32 LQDEFTVQMQAAWDKFLTGVAVVLTEKYR
            120       130       140

>>CCDS10399.1 HBA1 gene_id:3039|Hs108|chr16 (142 aa)
 initn: 264 init1: 264 opt: 332 Z-score: 451.4 bits: 89.5 E(32554): 8e-19
Smith-Waterman score: 332; 37.9% identity (72.4% similar) in 145 aa overlap (4-146:3-141)

               10        20          30        40        50
pF1KE5 MVHFTAEEKAAVTSLWSKMNVE--EAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGN
          ..  .:. : . :.:....  : :.::: :... .: :. .:  : .::      :.
CCDS10  MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLSH-----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE5 PKVKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATH
        .::.:::::  .. .:. ..:..  :.. ::.::  ::.::: :::::.. ... ::.:
CCDS10 AQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAH
            60        70        80        90       100       110

      120       130       140
pF1KE5 FGKEFTPEVQAAWQKLVSAVAIALAHKYH
       .  :::: :.:. .:....:. .:. ::
CCDS10 LPAEFTPAVHASLDKFLASVSTVLTSKYR
           120       130       140

>>CCDS10398.1 HBA2 gene_id:3040|Hs108|chr16 (142 aa)
 initn: 264 init1: 264 opt: 332 Z-score: 451.4 bits: 89.5 E(32554): 8e-19
Smith-Waterman score: 332; 37.9% identity (72.4% similar) in 145 aa overlap (4-146:3-141)

               10        20          30        40        50
pF1KE5 MVHFTAEEKAAVTSLWSKMNVE--EAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGN
          ..  .:. : . :.:....  : :.::: :... .: :. .:  : .::      :.
CCDS10  MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHF-DLSH-----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE5 PKVKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATH
        .::.:::::  .. .:. ..:..  :.. ::.::  ::.::: :::::.. ... ::.:
CCDS10 AQVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAH
            60        70        80        90       100       110

      120       130       140
pF1KE5 FGKEFTPEVQAAWQKLVSAVAIALAHKYH
       .  :::: :.:. .:....:. .:. ::
CCDS10 LPAEFTPAVHASLDKFLASVSTVLTSKYR
           120       130       140

>>CCDS10400.1 HBQ1 gene_id:3049|Hs108|chr16 (142 aa)
 initn: 306 init1: 228 opt: 318 Z-score: 432.8 bits: 86.0 E(32554): 8.7e-18
Smith-Waterman score: 318; 38.6% identity (70.3% similar) in 145 aa overlap (4-146:3-141)

               10          20        30        40        50
pF1KE5 MVHFTAEEKAAVTSLWSKM--NVEEAGGEALGRLLVVYPWTQRFFDSFGNLSSPSAILGN
          ..::..: : .::.:.  ::     ::: : ....: :. .: :  .:: :    :.
CCDS10  MALSAEDRALVRALWKKLGSNVGVYTTEALERTFLAFPATKTYF-SHLDLS-P----GS
                10        20        30        40         50

       60        70        80        90       100       110
pF1KE5 PKVKAHGKKVLTSFGDAIKNMDNLKPAFAKLSELHCDKLHVDPENFKLLGNVMVIILATH
        .:.:::.::  ... :.. .:.:  :.. ::.::  .:.::: .:.:::. ... :: :
CCDS10 SQVRAHGQKVADALSLAVERLDDLPHALSALSHLHACQLRVDPASFQLLGHCLLVTLARH
            60        70        80        90       100       110

      120       130       140
pF1KE5 FGKEFTPEVQAAWQKLVSAVAIALAHKYH
       .  .:.: .::. .:..: :  ::. .:
CCDS10 YPGDFSPALQASLDKFLSHVISALVSEYR
           120       130       140

147 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Mon Nov 7 21:29:42 2016 done: Mon Nov 7 21:29:42 2016
 Total Scan time: 1.550 Total Display time: -0.010

Function used was FASTA [36.3.4 Apr, 2011]