FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE1222, 245 aa
  1>>>pF1KE1222 245 - 245 aa - 245 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 7.8611+/-0.000894; mu= 7.7825+/- 0.055
 mean_var=288.3096+/-57.571, 0's: 0 Z-trim(116.4): 121 B-trim: 0 in 0/54
 Lambda= 0.075534
 statistics sampled from 16875 (16997) to 16875 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.821), E-opt: 0.2 (0.522), width: 16
 Scan time: 2.470

The best scores are:                                opt  bits E(32554)
CCDS44445.1 SFTPA1 gene_id:653509|Hs108|chr10 ( 248) 1757 203.6   1e-52
CCDS44444.2 SFTPA1 gene_id:653509|Hs108|chr10 ( 263) 1757 203.6 1.1e-52
CCDS41540.1 SFTPA2 gene_id:729238|Hs108|chr10 ( 248) 1706 198.0 4.9e-51
CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10     ( 375)  520  69.0 5.1e-12
CCDS7247.1 MBL2 gene_id:4153|Hs108|chr10      ( 248)  492  65.7 3.3e-11

>>CCDS44445.1 SFTPA1 gene_id:653509|Hs108|chr10 (248 aa)
 initn: 1757 init1: 1757 opt: 1757 Z-score: 1059.7 bits: 203.6 E(32554): 1e-52
Smith-Waterman score: 1757; 99.6% identity (100.0% similar) in 245 aa overlap (1-245:1-245)

               10        20        30        40        50        60
pF1KE1 MWLCPLALNLILMAASGAVCEVKDVCVGSPGIPGTPGSHGLPGRDGRDGVKGDPGPPGPM
       :::::::::::::::::::::::::::::::::::::::::::::::::.::::::::::
CCDS44 MWLCPLALNLILMAASGAVCEVKDVCVGSPGIPGTPGSHGLPGRDGRDGLKGDPGPPGPM
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE1 GPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGERGPPGLPAHLDEELQATLHDFRHQILQ
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 GPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGERGPPGLPAHLDEELQATLHDFRHQILQ
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE1 TRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVPRNPEENEAIASFVKK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 TRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVPRNPEENEAIASFVKK
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE1 YNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGKEQCVEMYTDGQWNDRNCLY
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 YNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGKEQCVEMYTDGQWNDRNCLY
              190       200       210       220       230       240

pF1KE1 SRLTI
       :::::
CCDS44 SRLTICEF

>>CCDS44444.2 SFTPA1 gene_id:653509|Hs108|chr10 (263 aa)
 initn: 1757 init1: 1757 opt: 1757 Z-score: 1059.5 bits: 203.6 E(32554): 1.1e-52
Smith-Waterman score: 1757; 99.6% identity (100.0% similar) in 245 aa overlap (1-245:16-260)

                              10        20        30        40
pF1KE1                MWLCPLALNLILMAASGAVCEVKDVCVGSPGIPGTPGSHGLPGRD
                      :::::::::::::::::::::::::::::::::::::::::::::
CCDS44 MRPCQVPGAATGPRAMWLCPLALNLILMAASGAVCEVKDVCVGSPGIPGTPGSHGLPGRD
               10        20        30        40        50        60

          50        60        70        80        90       100
pF1KE1 GRDGVKGDPGPPGPMGPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGERGPPGLPAHLDE
       ::::.:::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 GRDGLKGDPGPPGPMGPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGERGPPGLPAHLDE
               70        80        90       100       110       120

         110       120       130       140       150       160
pF1KE1 ELQATLHDFRHQILQTRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 ELQATLHDFRHQILQTRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVP
              130       140       150       160       170       180

         170       180       190       200       210       220
pF1KE1 RNPEENEAIASFVKKYNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGKEQCV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 RNPEENEAIASFVKKYNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGKEQCV
              190       200       210       220       230       240

         230       240
pF1KE1 EMYTDGQWNDRNCLYSRLTI
       ::::::::::::::::::::
CCDS44 EMYTDGQWNDRNCLYSRLTICEF
              250       260

>>CCDS41540.1 SFTPA2 gene_id:729238|Hs108|chr10 (248 aa)
 initn: 1706 init1: 1706 opt: 1706 Z-score: 1029.7 bits: 198.0 E(32554): 4.9e-51
Smith-Waterman score: 1706; 97.1% identity (98.8% similar) in 245 aa overlap (1-245:1-245)

               10        20        30        40        50        60
pF1KE1 MWLCPLALNLILMAASGAVCEVKDVCVGSPGIPGTPGSHGLPGRDGRDGVKGDPGPPGPM
       ::::::::.:::::::::.:::::::::::::::::::::::::::::::::::::::::
CCDS41 MWLCPLALTLILMAASGAACEVKDVCVGSPGIPGTPGSHGLPGRDGRDGVKGDPGPPGPM
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE1 GPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGERGPPGLPAHLDEELQATLHDFRHQILQ
       ::::: ::::::.:::::::.::: ::::: :::::::::::::::::::::::::::::
CCDS41 GPPGETPCPPGNNGLPGAPGVPGERGEKGEAGERGPPGLPAHLDEELQATLHDFRHQILQ
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE1 TRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVPRNPEENEAIASFVKK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS41 TRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVPRNPEENEAIASFVKK
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE1 YNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGKEQCVEMYTDGQWNDRNCLY
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS41 YNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGKEQCVEMYTDGQWNDRNCLY
              190       200       210       220       230       240

pF1KE1 SRLTI
       :::::
CCDS41 SRLTICEF

>>CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10 (375 aa)
 initn: 663 init1: 221 opt: 520 Z-score: 329.3 bits: 69.0 E(32554): 5.1e-12
Smith-Waterman score: 520; 36.7% identity (60.2% similar) in 226 aa overlap (27-245:147-372)

                   10        20        30        40        50
pF1KE1     MWLCPLALNLILMAASGAVCEVKDVCVGSPGIPGTPGSHGLPGRDGRDGVKGDPGP
                                     ::.::. :. :..:: :  :. :: :. :
CCDS73 AGREGPLGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGV
        120       130       140       150       160       170

         60          70        80        90       100         110
pF1KE1 PGPMGPPGEMPC--PPGNDGLPGAPGIPGECGEKGEPGERGPPGLP--AHLDEELQATLH
       ::  :  :      : :. :  : ::. :. :  :. : .:  :::  : : ....:
CCDS73 PGNTGAAGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQG
        180       190       200       210       220       230

              120       130       140       150       160       170
pF1KE1 DFRH--QILQTRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVPRNPEE
       . .:    ..    . :  . ..::::.:.. :    :   :  :..:::..: ::.  :
CCDS73 QVQHLQAAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAE
        240       250       260       270       280       290

              180       190       200       210       220
pF1KE1 NEAIASFVKKYNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRG-KEQCVEMYT
       : :. ..:   :  :....:.. . : : :  :  . :.::  :::   : .:.:::..:
CCDS73 NAALQQLVVAKNEAAFLSMTDSKTEGKFTYPTGESLVYSNWAPGEPNDDGGSEDCVEIFT
        300       310       320       330       340       350

     230       240
pF1KE1 DGQWNDRNCLYSRLTI
       .:.:::: :  .::..
CCDS73 NGKWNDRACGEKRLVVCEF
        360       370

>>CCDS7247.1 MBL2 gene_id:4153|Hs108|chr10 (248 aa)
 initn: 447 init1: 179 opt: 492 Z-score: 314.7 bits: 65.7 E(32554): 3.3e-11
Smith-Waterman score: 509; 36.7% identity (62.1% similar) in 256 aa overlap (5-245:8-243)

                  10           20         30          40        50
pF1KE1    MWLCPLALNLILMAASGA---VCE-VKDVCVGSPGIPG--TPGSHGLPGRDGRDGVK
              :: : : ..::: .   .:: .. .:   :.. .  .:: .:.::.:::::.:
CCDS72 MSLFPSLPLLL-LSMVAASYSETVTCEDAQKTC---PAVIACSSPGINGFPGKDGRDGTK
               10         20        30           40        50

                   60        70        80        90          100
pF1KE1 GDPGPPGP-----MGPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGERGPPG---LPAHL
       :. : ::      .::::..  ::::   ::  : ::  :.::.:: ..: :   : :
CCDS72 GEKGEPGQGLRGLQGPPGKLG-PPGN---PGPSGSPGPKGQKGDPG-KSPDGDSSLAASE
         60        70         80           90        100       110

           110       120       130       140       150       160
pF1KE1 DEELQATLHDFRHQILQTRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIA
        . ::. .  ... .  . :          ::.: : .::. .::. ..  :..  . .:
CCDS72 RKALQTEMARIKKWLTFSLGK--------QVGNKFFLTNGEIMTFEKVKALCVKFQASVA
             120       130               140       150       160

           170       180       190       200       210       220
pF1KE1 VPRNPEENEAIASFVKKYNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGK-E
       .:::  :: :: ...:.    :..:.:.  . :.:    :. ..:::: .::: . :. :
CCDS72 TPRNAAENGAIQNLIKEE---AFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDE
           170       180          190       200       210       220

            230       240
pF1KE1 QCVEMYTDGQWNDRNCLYSRLTI
       .:: .  .:::::  :  :.:..
CCDS72 DCVLLLKNGQWNDVPCSTSHLAVCEFPI
              230       240

245 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sun Nov 6 21:44:06 2016 done: Sun Nov 6 21:44:06 2016
 Total Scan time: 2.470 Total Display time: -0.010

Function used was FASTA [36.3.4 Apr, 2011]