FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB6569, 375 aa
1>>>pF1KB6569 375 - 375 aa - 375 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 10.0427+/-0.00116; mu= 1.0036+/- 0.071
mean_var=476.4702+/-96.721, 0's: 0 Z-trim(115.2): 141 B-trim: 0 in 0/54
Lambda= 0.058757
statistics sampled from 15572 (15712) to 15572 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.771), E-opt: 0.2 (0.483), width: 16
Scan time: 3.340
The best scores are:                                  opt  bits E(32554)
CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10     ( 375) 2603 234.5 1.2e-61
CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17   (1464)  660  70.6 9.9e-12
CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2     (1466)  638  68.7 3.6e-11
CCDS4436.1 COL23A1 gene_id:91522|Hs108|chr5   ( 540)  625  67.1 4.3e-11
CCDS12222.1 COL5A3 gene_id:50509|Hs108|chr19  (1745)  633  68.4 5.4e-11
CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9    (1838)  630  68.2 6.6e-11
CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9     (1838)  630  68.2 6.6e-11
>>CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10 (375 aa)
initn: 2603 init1: 2603 opt: 2603 Z-score: 1220.4 bits: 234.5 E(32554): 1.2e-61
Smith-Waterman score: 2603; 100.0% identity (100.0% similar) in 375 aa overlap (1-375:1-375)
               10        20        30        40        50        60
pF1KB6 MLLFLLSALVLLTQPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 MLLFLLSALVLLTQPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPR
               10        20        30        40        50        60
               70        80        90       100       110       120
pF1KB6 GEKGDPGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGRE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 GEKGDPGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGRE
               70        80        90       100       110       120
              130       140       150       160       170       180
pF1KB6 GPLGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNT
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 GPLGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNT
              130       140       150       160       170       180
              190       200       210       220       230       240
pF1KB6 GAAGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 GAAGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQH
              190       200       210       220       230       240
              250       260       270       280       290       300
pF1KB6 LQAAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAAL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 LQAAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAAL
              250       260       270       280       290       300
              310       320       330       340       350       360
pF1KB6 QQLVVAKNEAAFLSMTDSKTEGKFTYPTGESLVYSNWAPGEPNDDGGSEDCVEIFTNGKW
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 QQLVVAKNEAAFLSMTDSKTEGKFTYPTGESLVYSNWAPGEPNDDGGSEDCVEIFTNGKW
              310       320       330       340       350       360
              370
pF1KB6 NDRACGEKRLVVCEF
       :::::::::::::::
CCDS73 NDRACGEKRLVVCEF
              370
>>CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17 (1464 aa)
initn: 2177 init1: 618 opt: 660 Z-score: 324.0 bits: 70.6 E(32554): 9.9e-12
Smith-Waterman score: 660; 51.3% identity (60.3% similar) in 199 aa overlap (44-236:531-728)
20 30 40 50 60 70
pF1KB6 QPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGDPGLPGAAG
:.:::: : : : : : : :: ::
CCDS11 VAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAG
510 520 530 540 550 560
80 90 100 110 120 130
pF1KB6 QAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNIGPQG
: : :: :: : .:. : .: ::::: .: : : ::::: : :: ::.:. : ::
CCDS11 QDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQG
570 580 590 600 610 620
140 150 160 170 180 190
pF1KB6 KPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAMGPQG
::: : :: .:: : : : : : ::: :: : :::.::::. :: : .:: : .:
CCDS11 PPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERG
630 640 650 660 670 680
200 210 220 230 240
pF1KB6 SPGARG---PPGLKGDKGI---PGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQ
:: :: ::: : .: ::. ::::..: : : : . .:::
CCDS11 FPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPG-APGSQGAPGLQGMPGERGAAGLP
690 700 710 720 730
250 260 270 280 290 300
pF1KB6 YKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAK
CCDS11 GPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAP
740 750 760 770 780 790
>>CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2 (1466 aa)
initn: 3316 init1: 629 opt: 638 Z-score: 314.0 bits: 68.7 E(32554): 3.6e-11
Smith-Waterman score: 638; 49.7% identity (62.3% similar) in 183 aa overlap (46-222:531-713)
20 30 40 50 60 70
pF1KB6 LGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGDPGLPGAAGQA
: :: : : : : : :: ::. :..
CCDS22 GEKGPAGERGAPGPAGPRGAAGEPGRDGVPGGPGMRGMPGSPGGPGSDGKPGPPGSQGES
510 520 530 540 550 560
80 90 100 110 120 130
pF1KB6 GMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNIGPQGKP
: :: :: ::.:. : .: :::::. : : : : :: : .:: ::.:. :::: :
CCDS22 GRPGPPGPSGPRGQPGVMGFPGPKGNDGAPGKNGERGGPGGPGPQGPPGKNGETGPQGPP
570 580 590 600 610 620
140 150 160 170 180 190
pF1KB6 GPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAMGPQGSP
:: : .: ::..: :: :: : : .:: :: : ::: : :..:: :. :. : :.:
CCDS22 GPTGPGGDKGDTGPPGPQGLQGLPGTGGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAP
630 640 650 660 670 680
200 210 220 230 240
pF1KB6 GARGPPGLKGDKGI------PGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQYK
: :::::: : :. :: .:.:: .: :
CCDS22 GERGPPGLAGAPGLRGGAGPPGPEGGKGAAGPPGPPGAAGTPGLQGMPGERGGLGSPGPK
690 700 710 720 730 740
250 260 270 280 290 300
pF1KB6 KVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAKNE
CCDS22 GDKGEPGGPGADGVPGKDGPRGPTGPIGPPGPAGQPGDKGEGGAPGLPGIAGPRGSPGER
750 760 770 780 790 800
>>CCDS4436.1 COL23A1 gene_id:91522|Hs108|chr5 (540 aa)
initn: 1058 init1: 581 opt: 625 Z-score: 312.6 bits: 67.1 E(32554): 4.3e-11
Smith-Waterman score: 625; 46.6% identity (63.2% similar) in 193 aa overlap (44-236:134-321)
20 30 40 50 60 70
pF1KB6 QPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGDPGLPGAAG
. : ::..:::: :: : : ::::: :
CCDS44 AKIRTAREAPSECVCPPGPPGRRGKPGRRGDPGPPGQSGRDGYPGPLGLDGKPGLPGPKG
110 120 130 140 150 160
80 90 100 110 120 130
pF1KB6 QAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNIGPQG
. : ::. :: : .:..:..: ::: : : :::: : :: : .:: : .:. : .:
CCDS44 EKGAPGDFGPRGDQGQDGAAGPPGPPGPPGARGPPGDTGKDGPRGAQGPAGPKGEPGQDG
170 180 190 200 210 220
140 150 160 170 180 190
pF1KB6 KPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAMGPQG
. :::: ::::: :.:: .:. :. . :: : .: :: : : : : :: ::.:
CCDS44 EMGPKGPPGPKGEPGVPGKKGDDGTPSQPGPPGPKGEPGSMG-P--RGENGVDGAPGPKG
230 240 250 260 270 280
200 210 220 230 240 250
pF1KB6 SPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQYKKVEL
:: :: : : .: :: :: .:.. . : . . ..::.:
CCDS44 EPGHRGTDGAAGPRGAPGLKGEQGDTVVIDYDG--RILDALKGPPGPQGPPGPPGIPGAK
290 300 310 320 330
260 270 280 290 300 310
pF1KB6 FPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAKNEAAFL
CCDS44 GELGLPGAPGIDGEKGPKGQKGDPGEPGPAGLKGEAGEMGLSGLPGADGLKGEKGESASD
340 350 360 370 380 390
>>CCDS12222.1 COL5A3 gene_id:50509|Hs108|chr19 (1745 aa)
initn: 616 init1: 616 opt: 633 Z-score: 310.9 bits: 68.4 E(32554): 5.4e-11
Smith-Waterman score: 633; 48.0% identity (61.6% similar) in 198 aa overlap (46-237:530-726)
20 30 40 50 60
pF1KB6 LGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGD------PGLP
: :: :: : : : ::: ::::
CCDS12 GLKGEEGAEGPQGPRGLQGPHGPPGRVGKMGRPGADGARGLPGDTGPKGDRGFDGLPGLP
500 510 520 530 540 550
70 80 90 100 110 120
pF1KB6 GAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNI
: :: : :..: :: :..: : :: : :: .: ::: :. :: : :: :. :
CCDS12 GEKGQRGDFGHVGQPGPPGEDGERGAEGPPGPTGQAGEPGPRGLLGPRGSPGPTGRPGVT
560 570 580 590 600 610
130 140 150 160 170 180
pF1KB6 GPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAM
: .: :: ::..:: :: : ::.::. :..:: ::.: :.:::.: ::: : : :.
CCDS12 GIDGAPGAKGNVGPPGEPGPPGQQGNHGSQGLPGPQGLIGTPGEKGPPGNPGIPGLPGSD
620 630 640 650 660 670
190 200 210 220 230 240
pF1KB6 GPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQYK
:: : :: .:: : :: .: ::. : : : : .. . ..:::.
CCDS12 GPLGHPGHEGPTGEKGAQGPPGSAGPPGYPG-PRGVKGTSGNRGLQGEKGEKGEDGFPGF
680 690 700 710 720 730
250 260 270 280 290 300
pF1KB6 KVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAKNE
CCDS12 KGDVGLKGDQGKPGAPGPRGEDGPEGPKGQAGQAGEEGPPGSAGEKGKLGVPGLPGYPGR
740 750 760 770 780 790
>>CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa)
initn: 622 init1: 622 opt: 630 Z-score: 309.3 bits: 68.2 E(32554): 6.6e-11
Smith-Waterman score: 634; 47.5% identity (61.3% similar) in 204 aa overlap (45-236:603-805)
20 30 40 50 60
pF1KB6 PLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRG------EKGD---
.: ::: :: : .: :: :::
CCDS75 VGPPGSGGLKGEPGDVGPQGPRGVQGPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDRGF
580 590 600 610 620 630
70 80 90 100 110 120
pF1KB6 ---PGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGP
:::: :. : :: .:: :: ::.: :. : : : : ::: :. :: : ::
CCDS75 DGLAGLPGEKGHRGDPGPSGPPGPPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGP
640 650 660 670 680 690
130 140 150 160 170 180
pF1KB6 LGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGA
: : : .:.:::::..::.:: : ::.::. ::.:: ::.: : :::.: :. :
CCDS75 PGPPGVTGMDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGL
700 710 720 730 740 750
190 200 210 220 230 240
pF1KB6 AGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQ
: :: :: : :: .:::: :: .: :: .: : : : .. . ...:.:
CCDS75 PGMPGADGPPGHPGKEGPPGEKGGQGPPGPQGPIGYPG-PRGVKGADGIRGLKGTKGEKG
760 770 780 790 800 810
250 260 270 280 290 300
pF1KB6 AAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQ
CCDS75 EDGFPGFKGDMGIKGDRGEIGPPGPRGEDGPEGPKGRGGPNGDPGPLGPPGEKGKLGVPG
820 830 840 850 860 870
>>CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa)
initn: 622 init1: 622 opt: 630 Z-score: 309.3 bits: 68.2 E(32554): 6.6e-11
Smith-Waterman score: 634; 47.5% identity (61.3% similar) in 204 aa overlap (45-236:603-805)
20 30 40 50 60
pF1KB6 PLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRG------EKGD---
.: ::: :: : .: :: :::
CCDS69 VGPPGSGGLKGEPGDVGPQGPRGVQGPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDRGF
580 590 600 610 620 630
70 80 90 100 110 120
pF1KB6 ---PGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGP
:::: :. : :: .:: :: ::.: :. : : : : ::: :. :: : ::
CCDS69 DGLAGLPGEKGHRGDPGPSGPPGPPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGP
640 650 660 670 680 690
130 140 150 160 170 180
pF1KB6 LGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGA
: : : .:.:::::..::.:: : ::.::. ::.:: ::.: : :::.: :. :
CCDS69 PGPPGVTGMDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGL
700 710 720 730 740 750
190 200 210 220 230 240
pF1KB6 AGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQ
: :: :: : :: .:::: :: .: :: .: : : : .. . ...:.:
CCDS69 PGMPGADGPPGHPGKEGPPGEKGGQGPPGPQGPIGYPG-PRGVKGADGIRGLKGTKGEKG
760 770 780 790 800 810
250 260 270 280 290 300
pF1KB6 AAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQ
CCDS69 EDGFPGFKGDMGIKGDRGEIGPPGPRGEDGPEGPKGRGGPNGDPGPLGPPGEKGKLGVPG
820 830 840 850 860 870
375 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Fri Nov 4 15:47:21 2016 done: Fri Nov 4 15:47:22 2016
Total Scan time: 3.340 Total Display time: 0.010
Function used was FASTA [36.3.4 Apr, 2011]