FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE2766, 271 aa
  1>>>pF1KE2766 271 - 271 aa - 271 aa
Library: human.CCDS.faa
  18921897 residues in 33420 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.3993+/-0.000661; mu= 14.6360+/- 0.040
 mean_var=66.2179+/-13.474, 0's: 0 Z-trim(110.5): 7 B-trim: 0 in 0/52
 Lambda= 0.157611
 statistics sampled from 11840 (11846) to 11840 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.735), E-opt: 0.2 (0.354), width: 16
 Scan time: 1.180

The best scores are:                                  opt  bits E(33420)
CCDS10445.1 NUBP2 gene_id:10101|Hs109|chr16    ( 271) 1818 421.7 2.8e-118
CCDS66898.1 NUBP2 gene_id:10101|Hs109|chr16    ( 211) 1435 334.5 3.8e-92
CCDS10543.1 NUBP1 gene_id:4682|Hs109|chr16     ( 320)  949 224.1 1e-58
CCDS61839.1 NUBP1 gene_id:4682|Hs109|chr16     ( 309)  690 165.2 5.2e-41
CCDS41940.1 NUBPL gene_id:80224|Hs109|chr14    ( 319)  587 141.8 6e-34

>>CCDS10445.1 NUBP2 gene_id:10101|Hs109|chr16 (271 aa)
 initn: 1818 init1: 1818 opt: 1818 Z-score: 2237.0 bits: 421.7 E(33420): 2.8e-118
Smith-Waterman score: 1818; 100.0% identity (100.0% similar) in 271 aa overlap (1-271:1-271)

       10 20 30 40 50 60
pF1KE2 MEAAAEPGNLAGVRHIILVLSGKGGVGKSTISTELALALRHAGKKVGILDVDLCGPSIPR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 MEAAAEPGNLAGVRHIILVLSGKGGVGKSTISTELALALRHAGKKVGILDVDLCGPSIPR
       10 20 30 40 50 60

       70 80 90 100 110 120
pF1KE2 MLGAQGRAVHQCDRGWAPVFLDREQSISLMSVGFLLEKPDEAVVWRGPKKNALIKQFVSD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 MLGAQGRAVHQCDRGWAPVFLDREQSISLMSVGFLLEKPDEAVVWRGPKKNALIKQFVSD
       70 80 90 100 110 120

       130 140 150 160 170 180
pF1KE2 VAWGELDYLVVDTPPGTSDEHMATIEALRPYQPLGALVVTTPQAVSVGDVRRELTFCRKT
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 VAWGELDYLVVDTPPGTSDEHMATIEALRPYQPLGALVVTTPQAVSVGDVRRELTFCRKT
       130 140 150 160 170 180

       190 200 210 220 230 240
pF1KE2 GLRVMGIVENMSGFTCPHCTECTSVFSRGGGEELAQLAGVPFLGSVPLDPALMRTLEEGH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS10 GLRVMGIVENMSGFTCPHCTECTSVFSRGGGEELAQLAGVPFLGSVPLDPALMRTLEEGH
       190 200 210 220 230 240

       250 260 270
pF1KE2 DFIQEFPGSPAFAALTSIAQKILDATPACLP
       :::::::::::::::::::::::::::::::
CCDS10 DFIQEFPGSPAFAALTSIAQKILDATPACLP
       250 260 270

>>CCDS66898.1 NUBP2 gene_id:10101|Hs109|chr16 (211 aa)
 initn: 1435 init1: 1435 opt: 1435 Z-score: 1768.0 bits: 334.5 E(33420): 3.8e-92
Smith-Waterman score: 1435; 100.0% identity (100.0% similar) in 211 aa overlap (61-271:1-211)

       40 50 60 70 80 90
pF1KE2 ISTELALALRHAGKKVGILDVDLCGPSIPRMLGAQGRAVHQCDRGWAPVFLDREQSISLM
       ::::::::::::::::::::::::::::::
CCDS66 MLGAQGRAVHQCDRGWAPVFLDREQSISLM
       10 20 30

       100 110 120 130 140 150
pF1KE2 SVGFLLEKPDEAVVWRGPKKNALIKQFVSDVAWGELDYLVVDTPPGTSDEHMATIEALRP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS66 SVGFLLEKPDEAVVWRGPKKNALIKQFVSDVAWGELDYLVVDTPPGTSDEHMATIEALRP
       40 50 60 70 80 90

       160 170 180 190 200 210
pF1KE2 YQPLGALVVTTPQAVSVGDVRRELTFCRKTGLRVMGIVENMSGFTCPHCTECTSVFSRGG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS66 YQPLGALVVTTPQAVSVGDVRRELTFCRKTGLRVMGIVENMSGFTCPHCTECTSVFSRGG
       100 110 120 130 140 150

       220 230 240 250 260 270
pF1KE2 GEELAQLAGVPFLGSVPLDPALMRTLEEGHDFIQEFPGSPAFAALTSIAQKILDATPACL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS66 GEELAQLAGVPFLGSVPLDPALMRTLEEGHDFIQEFPGSPAFAALTSIAQKILDATPACL
       160 170 180 190 200 210

pF1KE2 P
       :
CCDS66 P

>>CCDS10543.1 NUBP1 gene_id:4682|Hs109|chr16 (320 aa)
 initn: 804 init1: 542 opt: 949 Z-score: 1168.0 bits: 224.1 E(33420): 1e-58
Smith-Waterman score: 949; 53.0% identity (80.6% similar) in 253 aa overlap (13-262:53-303)

       10 20 30 40
pF1KE2 MEAAAEPGNLAGVRHIILVLSGKGGVGKSTISTELALAL-RH
       :.: ::::::::::::::.:..:: .: .
CCDS10 QGCPNQRLCASGAGATPDTAIEEIKEKMKTVKHKILVLSGKGGVGKSTFSAHLAHGLAED
       30 40 50 60 70 80

       50 60 70 80 90 100
pF1KE2 AGKKVGILDVDLCGPSIPRMLGAQGRAVHQCDRGWAPVFLDREQSISLMSVGFLLEKPDE
       . ....::.:.::::::...: .:. ::: ::.::.. :.....::::::: .::.
CCDS10 ENTQIALLDIDICGPSIPKIMGLEGEQVHQSGSGWSPVYV--EDNLGVMSVGFLLSSPDD
       90 100 110 120 130 140

       110 120 130 140 150 160
pF1KE2 AVVWRGPKKNALIKQFVSDVAWGELDYLVVDTPPGTSDEHMATIEALRPYQPLGALVVTT
       ::.:::::::..::::. :: :::.:::.:::::::::::..... : . ::...::
CCDS10 AVIWRGPKKNGMIKQFLRDVDWGEVDYLIVDTPPGTSDEHLSVVRYLATAHIDGAVIITT
       150 160 170 180 190 200

       170 180 190 200 210
pF1KE2 PQAVSVGDVRRELTFCRKTGLRVMGIVENMSGFTCPHCTECTSVF--SRGGGEELAQLAG
       :: ::. :::.:..::::. : ..:.::::::: ::.: . ...: . ::.: . :
CCDS10 PQEVSLQDVRKEINFCRKVKLPIIGVVENMSGFICPKCKKESQIFPPTTGGAELMCQDLE
       210 220 230 240 250 260

       220 230 240 250 260 270
pF1KE2 VPFLGSVPLDPALMRTLEEGHDFIQEFPGSPAFAALTSIAQKILDATPACLP
       ::.:: ::::: . .. ..:..:. . : ::: : :: :.:
CCDS10 VPLLGRVPLDPLIGKNCDKGQSFFIDAPDSPATLAYRSIIQRIQEFCNLHQSKEENLISS
       270 280 290 300 310 320

>>CCDS61839.1 NUBP1 gene_id:4682|Hs109|chr16 (309 aa)
 initn: 854 init1: 542 opt: 690 Z-score: 850.0 bits: 165.2 E(33420): 5.2e-41
Smith-Waterman score: 860; 50.2% identity (76.7% similar) in 253 aa overlap (13-262:53-292)

       10 20 30 40
pF1KE2 MEAAAEPGNLAGVRHIILVLSGKGGVGKSTISTELALAL-RH
       :.: ::::::::::::::.:..:: .: .
CCDS61 QGCPNQRLCASGAGATPDTAIEEIKEKMKTVKHKILVLSGKGGVGKSTFSAHLAHGLAED
       30 40 50 60 70 80

       50 60 70 80 90 100
pF1KE2 AGKKVGILDVDLCGPSIPRMLGAQGRAVHQCDRGWAPVFLDREQSISLMSVGFLLEKPDE
       . ....::.:.::::::...: .:. :.....::::::: .::.
CCDS61 ENTQIALLDIDICGPSIPKIMGLEGEQY-------------VEDNLGVMSVGFLLSSPDD
       90 100 110 120

       110 120 130 140 150 160
pF1KE2 AVVWRGPKKNALIKQFVSDVAWGELDYLVVDTPPGTSDEHMATIEALRPYQPLGALVVTT
       ::.:::::::..::::. :: :::.:::.:::::::::::..... : . ::...::
CCDS61 AVIWRGPKKNGMIKQFLRDVDWGEVDYLIVDTPPGTSDEHLSVVRYLATAHIDGAVIITT
       130 140 150 160 170 180

       170 180 190 200 210
pF1KE2 PQAVSVGDVRRELTFCRKTGLRVMGIVENMSGFTCPHCTECTSVF--SRGGGEELAQLAG
       :: ::. :::.:..::::. : ..:.::::::: ::.: . ...: . ::.: . :
CCDS61 PQEVSLQDVRKEINFCRKVKLPIIGVVENMSGFICPKCKKESQIFPPTTGGAELMCQDLE
       190 200 210 220 230 240

       220 230 240 250 260 270
pF1KE2 VPFLGSVPLDPALMRTLEEGHDFIQEFPGSPAFAALTSIAQKILDATPACLP
       ::.:: ::::: . .. ..:..:. . : ::: : :: :.:
CCDS61 VPLLGRVPLDPLIGKNCDKGQSFFIDAPDSPATLAYRSIIQRIQEFCNLHQSKEENLISS
       250 260 270 280 290 300

>>CCDS41940.1 NUBPL gene_id:80224|Hs109|chr14 (319 aa)
 initn: 607 init1: 227 opt: 587 Z-score: 723.2 bits: 141.8 E(33420): 6e-34
Smith-Waterman score: 587; 37.4% identity (68.3% similar) in 262 aa overlap (10-268:63-316)

       10 20 30
pF1KE2 MEAAAEPGNLAGVRHIILVLSGKGGVGKSTISTELALAL
       . ::...:.: :::::::::: ...:::::
CCDS41 CGRQLSGAGSETLKQRRTQIMSRGLPKQKPIEGVKQVIVVASGKGGVGKSTTAVNLALAL
       40 50 60 70 80 90

       40 50 60 70 80 90
pF1KE2 --RHAGKKVGILDVDLCGPSIPRMLGAQGRAVHQCDRGWAPVFLDREQSISLMSVGFLLE
       ..: .:.::::. :::.:.:.. .: . . :.. . .:. ::.:::.:
CCDS41 AANDSSKAIGLLDVDVYGPSVPKMMNLKGNPELSQSNLMRPLL---NYGIACMSMGFLVE
       100 110 120 130 140

       100 110 120 130 140 150
pF1KE2 KPDEAVVWRGPKKNALIKQFVSDVAWGELDYLVVDTPPGTSDEHMATIEALRPYQPL-GA
       . .: ::::: . :.... .: ::.::::::: ::::.: .... . . :. ::
CCDS41 E-SEPVVWRGLMVMSAIEKLLRQVDWGQLDYLVVDMPPGTGDVQLSVSQNI----PITGA
       150 160 170 180 190 200

       160 170 180 190 200 210
pF1KE2 LVVTTPQAVSVGDVRRELTFCRKTGLRVMGIVENMSGFTCPHCTECTSVFSRGGGEELAQ
       ..:.::: ... :... . :.. . :.:.:.::: : ::.: . : .:. :...:::
CCDS41 VIVSTPQDIALMDAHKGAEMFRRVHVPVLGLVQNMSVFQCPKCKHKTHIFGADGARKLAQ
       210 220 230 240 250 260

       220 230 240 250 260 270
pF1KE2 LAGVPFLGSVPLDPALMRTLEEGHDFIQEFPGSPAFAALTSIAQKILDATPACLP
       :. ::..:: . .. . :. .. : : : :: ... :.
CCDS41 TLGLEVLGDIPLHLNIREASDTGQPIVFSQPESDEAKAYLRIAVEVVRRLPSPSE
       270 280 290 300 310

271 residues in 1 query sequences
18921897 residues in 33420 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Fri Jul 3 20:33:54 2020 done: Fri Jul 3 20:33:54 2020
 Total Scan time: 1.180 Total Display time: 0.030

Function used was FASTA [36.3.4 Apr, 2011]