FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE4559, 684 aa
 1>>>pF1KE4559 684 - 684 aa - 684 aa
Library: human.CCDS.faa
 18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 11.6198+/-0.00133; mu= -1.8470+/- 0.081
 mean_var=688.1282+/-141.183, 0's: 0 Z-trim(115.6): 215 B-trim: 0 in 0/52
 Lambda= 0.048892
 statistics sampled from 15978 (16189) to 15978 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.735), E-opt: 0.2 (0.497), width: 16
 Scan time: 4.870

The best scores are: opt bits E(32554)
CCDS13505.1 COL9A3 gene_id:1299|Hs108|chr20 ( 684) 5102 375.3 1.6e-103
CCDS450.1 COL9A2 gene_id:1298|Hs108|chr1 ( 689) 2422 186.3 1.3e-46
CCDS47447.1 COL9A1 gene_id:1297|Hs108|chr6 ( 678) 2394 184.3 4.9e-46
CCDS4971.1 COL9A1 gene_id:1297|Hs108|chr6 ( 921) 2368 182.7 2.1e-45
CCDS6376.1 COL22A1 gene_id:169044|Hs108|chr8 (1626) 1913 150.9 1.3e-35
CCDS8759.1 COL2A1 gene_id:1280|Hs108|chr12 (1418) 1851 146.5 2.5e-34
CCDS41778.1 COL2A1 gene_id:1280|Hs108|chr12 (1487) 1851 146.5 2.6e-34
CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9 (1838) 1838 145.7 5.6e-34
CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9 (1838) 1838 145.7 5.6e-34
CCDS33350.1 COL5A2 gene_id:1290|Hs108|chr2 (1499) 1833 145.2 6.3e-34
CCDS34682.1 COL1A2 gene_id:1278|Hs108|chr7 (1366) 1821 144.3 1.1e-33
CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17 (1464) 1774 141.1 1.1e-32
CCDS780.2 COL11A1 gene_id:1301|Hs108|chr1 (1690) 1773 141.1 1.3e-32
CCDS53348.1 COL11A1 gene_id:1301|Hs108|chr1 (1767) 1773 141.1 1.3e-32
CCDS778.1 COL11A1 gene_id:1301|Hs108|chr1 (1806) 1773 141.1 1.3e-32
CCDS35366.1 COL4A5 gene_id:1287|Hs108|chrX (1691) 1744 139.0 5.3e-32
CCDS6802.1 COL27A1 gene_id:85301|Hs108|chr9 (1860) 1735 138.5 8.6e-32
CCDS43452.1 COL11A2 gene_id:1302|Hs108|chr6 (1650) 1724 137.6 1.4e-31
CCDS9511.1 COL4A1 gene_id:1282|Hs108|chr13 (1669) 1722 137.5 1.5e-31
CCDS42829.1 COL4A3 gene_id:1285|Hs108|chr2 (1670) 1699 135.9 4.7e-31
CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2 (1466) 1693 135.3 5.9e-31
CCDS14543.1 COL4A5 gene_id:1287|Hs108|chrX (1685) 1694 135.5 6.1e-31
CCDS12222.1 COL5A3 gene_id:50509|Hs108|chr19 (1745) 1687 135.0 8.7e-31
CCDS41353.1 COL24A1 gene_id:255631|Hs108|chr1 (1714) 1655 132.8 4.1e-30
CCDS42828.1 COL4A4 gene_id:1286|Hs108|chr2 (1690) 1642 131.8 7.7e-30
CCDS41907.1 COL4A2 gene_id:1284|Hs108|chr13 (1712) 1607 129.4 4.3e-29
CCDS41297.1 COL16A1 gene_id:1307|Hs108|chr1 (1604) 1605 129.2 4.6e-29
CCDS14542.1 COL4A6 gene_id:1288|Hs108|chrX (1690) 1576 127.2 1.9e-28
CCDS14541.1 COL4A6 gene_id:1288|Hs108|chrX (1691) 1576 127.2 1.9e-28
CCDS76010.1 COL4A6 gene_id:1288|Hs108|chrX (1707) 1576 127.2 2e-28
CCDS55025.1 COL21A1 gene_id:81578|Hs108|chr6 ( 957) 1561 125.8 2.9e-28
CCDS83099.1 COL21A1 gene_id:81578|Hs108|chr6 ( 954) 1558 125.6 3.4e-28
CCDS76009.1 COL4A6 gene_id:1288|Hs108|chrX (1666) 1556 125.8 5.1e-28
CCDS76008.1 COL4A6 gene_id:1288|Hs108|chrX (1633) 1555 125.7 5.3e-28
CCDS2773.1 COL7A1 gene_id:1294|Hs108|chr3 (2944) 1436 117.7 2.5e-25
CCDS43258.1 COL25A1 gene_id:84570|Hs108|chr4 ( 654) 1287 106.2 1.6e-22
CCDS72756.1 COL8A2 gene_id:1296|Hs108|chr1 ( 638) 1283 105.9 1.9e-22
CCDS403.1 COL8A2 gene_id:1296|Hs108|chr1 ( 703) 1283 106.0 2e-22
CCDS7554.1 COL17A1 gene_id:1308|Hs108|chr10 (1497) 1272 105.7 5.2e-22
CCDS42971.1 COL18A1 gene_id:80781|Hs108|chr21 (1339) 1253 104.3 1.2e-21
CCDS42972.1 COL18A1 gene_id:80781|Hs108|chr21 (1519) 1253 104.3 1.3e-21
CCDS77643.1 COL18A1 gene_id:80781|Hs108|chr21 (1754) 1253 104.4 1.4e-21
CCDS43259.1 COL25A1 gene_id:84570|Hs108|chr4 ( 642) 1237 102.7 1.8e-21
CCDS58922.1 COL25A1 gene_id:84570|Hs108|chr4 ( 645) 1192 99.5 1.6e-20
CCDS44428.2 COL13A1 gene_id:1305|Hs108|chr10 ( 610) 1161 97.3 7.1e-20
CCDS4436.1 COL23A1 gene_id:91522|Hs108|chr5 ( 540) 1103 93.1 1.1e-18
CCDS2934.1 COL8A1 gene_id:1295|Hs108|chr3 ( 744) 1101 93.2 1.5e-18
CCDS4970.1 COL19A1 gene_id:1310|Hs108|chr6
(1142) 1033 88.6 5.3e-17 CCDS5105.1 COL10A1 gene_id:1300|Hs108|chr6 ( 680) 986 85.0 3.9e-16 CCDS13730.1 COL6A2 gene_id:1292|Hs108|chr21 ( 828) 926 80.9 8.2e-15 >>CCDS13505.1 COL9A3 gene_id:1299|Hs108|chr20 (684 aa) initn: 5102 init1: 5102 opt: 5102 Z-score: 1972.2 bits: 375.3 E(32554): 1.6e-103 Smith-Waterman score: 5102; 100.0% identity (100.0% similar) in 684 aa overlap (1-684:1-684) 10 20 30 40 50 60 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLPGP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLPGP 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE4 PGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPGLGGKGLP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 PGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPGLGGKGLP 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE4 GPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLPEGATDLQCPSICPP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 GPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLPEGATDLQCPSICPP 130 140 150 160 170 180 190 200 210 220 230 240 pF1KE4 GPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLPGPL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 GPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLPGPL 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE4 GPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPSGEP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 GPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPSGEP 250 260 270 280 290 300 310 320 330 340 350 360 pF1KE4 GMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 GMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEA 310 320 330 340 350 360 370 380 390 400 410 420 pF1KE4 GPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLPGPQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 
CCDS13 GPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLPGPQ 370 380 390 400 410 420 430 440 450 460 470 480 pF1KE4 GLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 GLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQ 430 440 450 460 470 480 490 500 510 520 530 540 pF1KE4 GPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQIAQL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 GPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQIAQL 490 500 510 520 530 540 550 560 570 580 590 600 pF1KE4 AAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTGELGDPGPRGNQGD :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 AAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTGELGDPGPRGNQGD 550 560 570 580 590 600 610 620 630 640 650 660 pF1KE4 RGDKGAAGAGLDGPEGDQGPQGPQGVPGTSKDGQDGAPGEPGPPGDPGLPGAIGAQGTPG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 RGDKGAAGAGLDGPEGDQGPQGPQGVPGTSKDGQDGAPGEPGPPGDPGLPGAIGAQGTPG 610 620 630 640 650 660 670 680 pF1KE4 ICDTSACQGAVLGGVGEKSGSRSS :::::::::::::::::::::::: CCDS13 ICDTSACQGAVLGGVGEKSGSRSS 670 680 >>CCDS450.1 COL9A2 gene_id:1298|Hs108|chr1 (689 aa) initn: 3370 init1: 1317 opt: 2422 Z-score: 950.5 bits: 186.3 E(32554): 1.3e-46 Smith-Waterman score: 2422; 51.0% identity (65.8% similar) in 681 aa overlap (1-670:1-672) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPG---PPGPPGPPGKPGQDGIDGEAGPPGL ::. : .: ::.:: .... : :: : :: ::::::::: ::.:::::. :::: CCDS45 MAAATA-SPRSLLVLL-QVVVLALAQIRGPPGERGPPGPPGPPGVPGSDGIDGDNGPPGK 10 20 30 40 50 60 70 80 90 100 110 pF1KE4 PGPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPGLGGK ::::::: ::: : : : ::.::::: : ::: : :: .:. : :::::: : CCDS45 AGPPGPKGEPGKAGPDGP---DGKPGIDGLTGAKGEPGPMGIPGVKGQPGLPGPPGLPGP 60 70 80 90 100 110 120 130 140 150 160 170 pF1KE4 GLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLP--EGATDLQCP :. :::: : : :: ::.::: : : : ::::::: ::.::.. ::..:. 
:: CCDS45 GFAGPPGPPGPVGLPGEIGIRGPKGDPGPDGPSGPPGPPGKPGRPGTIQGLEGSADFLCP 120 130 140 150 160 170 180 190 200 210 220 230 pF1KE4 SICPPGPPGPPGMPGFKGPTGYKG---EQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRG . :::: ::::. : :: .: .: . :. :: : ::: : : :.:: ::.: CCDS45 TNCPPGMKGPPGLQGVKGHAGKRGILGDPGHQGKPGPKGDVGASGEQGIPGP---PGPQG 180 190 200 210 220 230 240 250 260 270 280 290 pF1KE4 LRGLPGPLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPG .:: :: :: :. :: :..: : :: : :..: ::: : : ::: : :: .: : CCDS45 IRGYPGMAGPKGETGPHGYKGMVGAIGATGPPGEEGPRGPPGRAGEKGDEGSPGIRGPQG 240 250 260 270 280 290 300 310 320 330 340 350 pF1KE4 VAGPSGEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERG ..::.: : :: .:..:.:: :.:: ::. : :: : .:: :.::. :.:: :..: CCDS45 ITGPKGATGPPGINGKDGTPGTPGMKGSAGQAGQPGSPGHQGLAGVPGQPGTKGGPGDQG 300 310 320 330 340 350 360 370 380 390 400 410 pF1KE4 RAGELGEAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMG . : : : :: :: :. : :: : : :. : : .:: : :: .: :: :: .: CCDS45 EPGPQGLPGFSGPPGKEGEPGPRGEIGPQGIMGQKGDQGERGPVGQPGPQGRQGPKGEQG 360 370 380 390 400 410 420 430 440 450 460 470 pF1KE4 DPGLPGPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRG ::.:::::: : ::.: : .::.: : : ::::.::: : :: ::::..: :: CCDS45 PPGIPGPQGLPGVKGDKGSPGKTGPRGKVGDPGVAGLPGEKGEKGESGEPGPKGQQGVRG 420 430 440 450 460 470 480 490 500 510 520 530 pF1KE4 ELGPKGTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGM : : : .: :. :::: :::::: :: : :::: :. :: :..:..:.: .. : CCDS45 EPGYPGPSGDAGAPGVQGYPGPPGPRGLAGNRGVPGQPGRQGVEGRDATDQHIVDVALKM 480 490 500 510 520 530 540 550 560 570 580 pF1KE4 ISEQIAQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIG---HPGARGPPGYRGPTGEL ..::.:..:. .. : :..: :: ::::::: ::. : ::: :: :: : .:.. CCDS45 LQEQLAEVAVSAKRE-ALGAVGMMGPPGPPGPPGYPGKQGPHGHPGPRGVPGIVGAVGQI 540 550 560 570 580 590 590 600 610 620 630 640 pF1KE4 GDPGPRGNQGDRGDKGAAGAGLDGPEGDQGPQGPQGVPGTSKDGQDGAPGEPGPPGDPGL :. ::.:..:..:: : .: : : : : : : :: . .:.:: : :: ::. 
: CCDS45 GNTGPKGKRGEKGDPGEVGRGHPGMPGPPGIPGLPGRPGQAINGKDGDRGSPGAPGEAGR 600 610 620 630 640 650 650 660 670 680 pF1KE4 PGAIGAQGTPGICDTSACQGAVLGGVGEKSGSRSS :: : : ::.:. .:: :: CCDS45 PGLPGPVGLPGFCEPAACLGASAYASARLTEPGSIKGP 660 670 680 >>CCDS47447.1 COL9A1 gene_id:1297|Hs108|chr6 (678 aa) initn: 3341 init1: 1808 opt: 2394 Z-score: 939.9 bits: 184.3 E(32554): 4.9e-46 Smith-Waterman score: 2394; 50.8% identity (66.1% similar) in 663 aa overlap (11-667:11-663) 10 20 30 40 50 60 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLPGP : ::::: : :: : :::::::::: :: :::::. :: : ::: CCDS47 MAWTARDRGALGLLLLGLCLCAAQRGPPGEQGPPGPPGPPGVPGIDGIDGDRGPKGPPGP 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE4 PGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPGLGGKGLP ::: : ::::: ::. :: ::.::::: :: :: :. :..: : :: :. :.:.: CCDS47 PGPAGEPGKPGAPGK---PGTPGADGLTGPDGSPGSIGSKGQKGEPGVPGSRGFPGRGIP 70 80 90 100 110 130 140 150 160 170 180 pF1KE4 GPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLPEGATDLQCPSICPP :::: :..: :: .: :: : :: :::::::::: :.. : ::. ::: CCDS47 GPPGPPGTAGLPGELGRVGPVGD---PGRRGPPGPPGPPGPRGTIGFHDGDPLCPNACPP 120 130 140 150 160 170 190 200 210 220 230 240 pF1KE4 GPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLPGPL : : ::.::..: : ::: :: :..:.::. : : : :. : : .::::. : . CCDS47 GRSGYPGLPGMRGHKGAKGEIGEPGRQGHKGEEGDQGELGEVGAQGPPGAQGLRGITGIV 180 190 200 210 220 230 250 260 270 280 290 300 pF1KE4 GPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPSGEP : :..: :. : :: : :: ::.:.::: : ::::: : : .: ::. ::.:. CCDS47 GDKGEKGARGLDGEPGPQGLPGAPGDQGQRGPPGEAGPKGDRGAEGARGIPGLPGPKGDT 240 250 260 270 280 290 310 320 330 340 350 360 pF1KE4 GMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEA :.:: ::..:.::. : ::: :. : ::. : .::::.:: :.:: ::.: .: :. CCDS47 GLPGVDGRDGIPGMPGTKGEPGKPGPPGDAGLQGLPGVPGIPGAKGVAGEKGSTGAPGKP 300 310 320 330 340 350 370 380 390 400 410 420 pF1KE4 GPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLPGPQ : :. : ::. : ::: : : .: :. : :: :.::. : :. 
:: : :::::: CCDS47 GQMGNSGKPGQQGPPGEVGPRGPQGLPGSRGELGPVGSPGLPGKLGSLGSPGLPGLPGPP 360 370 380 390 400 410 430 440 450 460 470 480 pF1KE4 GLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQ :: : :::: : ::::.:: .: .: :..:::: :: :::: .:. :: : .: . CCDS47 GLPGMKGDRGVVGEPGPKGEQGASGEEGEAGERGELGDIGLPGPKGSAGNPGEPGLRGPE 420 430 440 450 460 470 490 500 510 520 530 540 pF1KE4 GPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQIAQL : : ::.: ::::: :.:: :. :. : : ::. ..:.:...: .:.:..:.. CCDS47 GSRGLPGVEGPRGPPGPRGVQGEQGATGLPGVQGPPGRAPTDQHIKQVCMRVIQEHFAEM 480 490 500 510 520 530 550 560 570 580 590 pF1KE4 AAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGA---RGPPGYRGPTGELGDPGPRGN :: :..: :. : :: :::::::::: : :: :: :: .:: : :: ::.:. CCDS47 AASLKRP-DSGATGLPGRPGPPGPPGPPGENGFPGQMGIRGLPGIKGPPGALGLRGPKGD 540 550 560 570 580 590 600 610 620 630 640 650 pF1KE4 QGDRGDKGAAGAGLDGPEGDQGPQGPQGVPGTSKDGQDGAPGEPGPPGD---PGLPGAIG :..:..: : : :.: : : : :: .. :..: :: :::: ::.:: : CCDS47 LGEKGERGPPGRG---PNGLPGAIGLPGDPGPASYGRNGRDGERGPPGVAGIPGVPGPPG 600 610 620 630 640 650 660 670 680 pF1KE4 AQGTPGICDTSACQGAVLGGVGEKSGSRSS : ::.:. ..: CCDS47 PPGLPGFCEPASCTMQAGQRAFNKGPDP 660 670 >>CCDS4971.1 COL9A1 gene_id:1297|Hs108|chr6 (921 aa) initn: 5409 init1: 1808 opt: 2368 Z-score: 928.6 bits: 182.7 E(32554): 2.1e-45 Smith-Waterman score: 2368; 50.9% identity (66.7% similar) in 645 aa overlap (29-667:272-906) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLP : :::::::::: :: :::::. :: : : CCDS49 CDPLRPRRETCHELPARITPSQTTDERGPPGEQGPPGPPGPPGVPGIDGIDGDRGPKGPP 250 260 270 280 290 300 60 70 80 90 100 110 pF1KE4 GPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPGLGGKG ::::: : ::::: ::. :: ::.::::: :: :: :. :..: : :: :. :.: CCDS49 GPPGPAGEPGKPGAPGK---PGTPGADGLTGPDGSPGSIGSKGQKGEPGVPGSRGFPGRG 310 320 330 340 350 120 130 140 150 160 170 pF1KE4 LPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLPEGATDLQCPSIC .::::: :..: :: .: :: : :: :::::::::: :.. : ::. 
: CCDS49 IPGPPGPPGTAGLPGELGRVGPVGD---PGRRGPPGPPGPPGPRGTIGFHDGDPLCPNAC 360 370 380 390 400 410 180 190 200 210 220 230 pF1KE4 PPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLPG ::: : ::.::..: : ::: :: :..:.::. : : : :. : : .::::. : CCDS49 PPGRSGYPGLPGMRGHKGAKGEIGEPGRQGHKGEEGDQGELGEVGAQGPPGAQGLRGITG 420 430 440 450 460 470 240 250 260 270 280 290 pF1KE4 PLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPSG .: :..: :. : :: : :: ::.:.::: : ::::: : : .: ::. ::.: CCDS49 IVGDKGEKGARGLDGEPGPQGLPGAPGDQGQRGPPGEAGPKGDRGAEGARGIPGLPGPKG 480 490 500 510 520 530 300 310 320 330 340 350 pF1KE4 EPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELG . :.:: ::..:.::. : ::: :. : ::. : .::::.:: :.:: ::.: .: : CCDS49 DTGLPGVDGRDGIPGMPGTKGEPGKPGPPGDAGLQGLPGVPGIPGAKGVAGEKGSTGAPG 540 550 560 570 580 590 360 370 380 390 400 410 pF1KE4 EAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLPG . : :. : ::. : ::: : : .: :. : :: :.::. : :. :: : ::::: CCDS49 KPGQMGNSGKPGQQGPPGEVGPRGPQGLPGSRGELGPVGSPGLPGKLGSLGSPGLPGLPG 600 610 620 630 640 650 420 430 440 450 460 470 pF1KE4 PQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKG : :: : :::: : ::::.:: .: .: :..:::: :: :::: .:. :: : .: CCDS49 PPGLPGMKGDRGVVGEPGPKGEQGASGEEGEAGERGELGDIGLPGPKGSAGNPGEPGLRG 660 670 680 690 700 710 480 490 500 510 520 530 pF1KE4 TQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQIA .: : ::.: ::::: :.:: :. :. : : ::. ..:.:...: .:.:..: CCDS49 PEGSRGLPGVEGPRGPPGPRGVQGEQGATGLPGVQGPPGRAPTDQHIKQVCMRVIQEHFA 720 730 740 750 760 770 540 550 560 570 580 590 pF1KE4 QLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGA---RGPPGYRGPTGELGDPGPR ..:: :..: . :. : :: :::::::::: : :: :: :: .:: : :: ::. CCDS49 EMAASLKRPDS-GATGLPGRPGPPGPPGPPGENGFPGQMGIRGLPGIKGPPGALGLRGPK 780 790 800 810 820 830 600 610 620 630 640 650 pF1KE4 GNQGDRGDKGAAGAGLDGPEGDQGPQGPQGVPGTSKDGQDGAPGEPGPPGD---PGLPGA :. :..:..: : ::.: : : : :: .. 
:..: :: :::: ::.:: CCDS49 GDLGEKGERGPPGR---GPNGLPGAIGLPGDPGPASYGRNGRDGERGPPGVAGIPGVPGP 840 850 860 870 880 890 660 670 680 pF1KE4 IGAQGTPGICDTSACQGAVLGGVGEKSGSRSS : : ::.:. ..: CCDS49 PGPPGLPGFCEPASCTMQAGQRAFNKGPDP 900 910 920 >>CCDS6376.1 COL22A1 gene_id:169044|Hs108|chr8 (1626 aa) initn: 3203 init1: 1217 opt: 1913 Z-score: 752.7 bits: 150.9 E(32554): 1.3e-35 Smith-Waterman score: 1949; 44.4% identity (56.1% similar) in 711 aa overlap (28-681:924-1622) 10 20 30 40 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPG----P-----PGKPGQDGI :: ::: :::: : ::: :. : CCDS63 QGPTGPPGAKGQEGAHGAPGAAGNPGAPGHVGAPGPSGPPGSVGAPGLRGTPGKDGERGE 900 910 920 930 940 950 50 60 70 80 90 pF1KE4 DGEAGPPGLPGPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGP------------ : :: : ::: ::.: :: :: :: : : : :: : : ::: CCDS63 KGAAGEEGSPGPVGPRGDPGAPGLPGPPG-KGKDGEPGLRGSPGLPGPLGTKAACGKVRG 960 970 980 990 1000 1010 100 110 120 130 pF1KE4 --------------KGAPGERGSLGPPGPPGLGGKGLPGPPGEAGVSGPPGGIGLRG--- .:::: :: : : ::.: : ::: : : .: ::. :: : CCDS63 SENCALGGQCVKGDRGAPGIPGSPGSRGDPGIGVAGPPGPSGPPGDKGSPGSRGLPGFPG 1020 1030 1040 1050 1060 1070 140 150 160 170 180 190 pF1KE4 PPGPSGLPGLPGPPGPPGPPGHPGV---LPEGATDLQCPSIC---PPGPPGPPGMPGFKG : ::.: : :: :: ::::.::. : : .: ..: :::::: ::.::::: CCDS63 PQGPAGRDGAPGNPGERGPPGKPGLSSLLSPGDINLLAKDVCNDCPPGPPGLPGLPGFKG 1080 1090 1100 1110 1120 1130 200 210 220 230 240 250 pF1KE4 PTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLPGPLGPPGDRG-P--IG : :. :. : .:.::. :::: : :: .: :: .: :: : .: ::.: : : CCDS63 DKGVPGKPGREGTEGKKGEAGPPGLPGPPGIAGPQGSQGERGADGEVGQKGDQGHPGVPG 1140 1150 1160 1170 1180 1190 260 270 280 290 300 pF1KE4 FRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPSGEPGMPG---KDG : :::: :: :: : : :: :.. : :: .: :: :::: ::.:: :.: CCDS63 FMGPPGNPGPPGADGIAGAAGPPGIQ------GSPGKEGPPGPQGPSGLPGIPGEEGKEG 1200 1210 1220 1230 1240 310 320 330 340 350 360 pF1KE4 QNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEAGPSGEPG ..: :: :. :.::. : :: .: : ::. 
:..:..: : ::..: .: : : :: CCDS63 RDGKPGPPGEPGKAGEPGLPGPEGARGPPGFKGHTGDSGAPGPRGESGAMGLPGQEGLPG 1250 1260 1270 1280 1290 1300 370 380 390 400 410 420 pF1KE4 VPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLPGPQGLRGDVG ::.: : .: : :: : : : :: :: : ::::: :. : :: :. : : CCDS63 KDGDTGPTGPQGPQGPRGPPGKNGSPGSPGEPGPSGTPGQKGSKGENGSPGLPGFLGPRG 1310 1320 1330 1340 1350 1360 430 440 450 460 470 480 pF1KE4 DRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQGPNGTSG : : : : .:. :. : :: ::: : :. : :: :..:. : : : .: .: CCDS63 PPGEPGEKGVPGKEGVPGKPGEPGFKGERGDPGIKGDKGPPGGKGQPGDPGIPGHKGHTG 1370 1380 1390 1400 1410 1420 490 500 510 520 530 540 pF1KE4 V---QGVPGPPGPLGLQGVPGVPGITGKPG-VPGKEASEQRIRELCGGMISEQIAQLAAH . ::.:: ::.: : :: ::. : : :. :. .. :.: : .. ..: : :. CCDS63 LMGPQGLPGENGPVGPPGPPGQPGFPGLRGESPSMETLRRLIQEELGKQLETRLAYLLAQ 1430 1440 1450 1460 1470 1480 550 560 570 580 590 600 pF1KE4 LRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTGELGDPGPRGNQGDRGD . .: ::::: :::: : :: : : : :: : : : ::.:..: .:: CCDS63 MPPAYMKSSQGRPGPPGPPGKDGLPGRAGPMGEPGRPGQGGLEGPSGPIGPKGERGAKGD 1490 1500 1510 1520 1530 1540 610 620 630 640 650 660 pF1KE4 KGAAGAGLDGPEGDQGPQGPQGVPGTSKDGQDGAPG---EPGPPGDPGLPGAIGAQGTPG :: :.:: : : : : : :: .::: : :: : :: : ::::: : :: CCDS63 PGAPGVGLRGEMGPPGIPGQPGEPGYAKDGLPGIPGPQGETGPAGHPGLPGP---PGPPG 1550 1560 1570 1580 1590 1600 670 680 pF1KE4 ICDTSACQGAVLGGVGEKSGSRSS :: : : : ..... . :. CCDS63 QCDPSQC--AYFASLAARPGNVKGP 1610 1620 >-- initn: 1203 init1: 620 opt: 1337 Z-score: 533.1 bits: 110.3 E(32554): 2.3e-23 Smith-Waterman score: 1373; 47.1% identity (57.0% similar) in 512 aa overlap (31-517:451-922) 10 20 30 40 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQD-------------- : :: :: : ::.. CCDS63 IVIYCDSRHAELETCCDIPSGPCQVTVVTEPPPPPPPQRPPTPGSEQIGFLKTINCSCPA 430 440 450 460 470 480 50 60 70 80 90 100 pF1KE4 GIDGE---AGPPGLPGPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGER : :: ::: ::::: : :: : : :: : : :. :: : .: ::. 
CCDS63 GEKGEMGVAGPMGLPGPKGDIGAIGPVGAPGPKGEKGDVGI-------GPFG-QGEKGEK 490 500 510 520 530 110 120 130 140 150 160 pF1KE4 GSLGPPGPPGL-GGKGLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHP :::: ::::: :.::. : ::: : : :: .:.::: :: :::::::: : :: CCDS63 GSLGLPGPPGRDGSKGMRGEPGELGEPGLPGEVGMRGPQGP---PGLPGPPGRVGAPGLQ 540 550 560 570 580 170 180 190 200 210 220 pF1KE4 GVLPEGATDLQCPSICPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLP : : .: . : : :: :: : : : .: .: .::::: :: :: :.: CCDS63 GERGEKGTRGEKGE---RGLDGFPGKPGDTGQQGRPGPSGVAGPQGEKGDVGPAGPPGVP 590 600 610 620 630 640 230 240 250 260 270 pF1KE4 GSVGLQ-GPRGLRGLPGPLG------PPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGF ::: : : .: .: ::: : ::: ::::: .: : :: : : .:. :: :. CCDS63 GSVVQQEGLKGEQGAPGPRGHQGAPGPPGARGPIGPEGRDGPPGLQGLRGKKGDMGPPGI 650 660 670 680 690 700 280 290 300 310 320 330 pF1KE4 RGPKGDLGRPGPKGTPGVAGPSGEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGL : : : ::: :.:: ::.: ::.: :. : :: . : : .: ::. :::: CCDS63 PGLLGLQGPPGPPGVPGPPGPGGSPGLP---GEIGFPG---KPGPPGPTGPPGKDGPNGP 710 720 730 740 750 760 340 350 360 370 380 390 pF1KE4 PGLPGRAGSKGEKGERGRAGELGEAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGP :: :: .::: ::::. : :. : :: : : :: :::.:::: : : CCDS63 PGPPG---TKGEPGERGEDGLPGKPGLRGEIGEQGLAGRPGEKGEAGL--------P-GA 770 780 790 800 400 410 420 430 440 450 pF1KE4 PGAPGVRGFQGQKGSMGDPGLPGPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGE :: ::::: .:..: :. :::: :.:: :: : ::: : :. :. .: . . CCDS63 PGFPGVRGEKGDQGEKGELGLPG---LKGD---RGEKGEAGPAGPPGLPGTTSLFTPHPR 810 820 830 840 850 860 460 470 480 490 500 510 pF1KE4 LGPSGLVGPKGESGSRGELGPKGTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGV . : : :::::.:. : : : :: : : :: :::: : .:. :.:: .:.::. 
CCDS63 M-P-GEQGPKGEKGDPGLPGEPGLQGRPGELGPQGPTGPPGAKGQEGAHGAPGAAGNPGA 870 880 890 900 910 920 520 530 540 550 560 570 pF1KE4 PGKEASEQRIRELCGGMISEQIAQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPG :: CCDS63 PGHVGAPGPSGPPGSVGAPGLRGTPGKDGERGEKGAAGEEGSPGPVGPRGDPGAPGLPGP 930 940 950 960 970 980 >>CCDS8759.1 COL2A1 gene_id:1280|Hs108|chr12 (1418 aa) initn: 1212 init1: 1212 opt: 1851 Z-score: 729.6 bits: 146.5 E(32554): 2.5e-34 Smith-Waterman score: 1916; 45.5% identity (54.4% similar) in 723 aa overlap (10-660:11-711) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLPG .:: ::.. .: : : : ::: : : :: : .:: : :: CCDS87 MIRLGAPQTLVLLTLLVAAVLRCQG-QDVRQPGPKGQKGEPGD-----IKDIVGPKGPPG 10 20 30 40 50 60 70 80 90 100 110 pF1KE4 PPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPGLGGK-- : :: : : : :. : : :: :::: :: : :: : ::::::::::. CCDS87 PQGPAGEQGPRGDRGDKGEKGAPGP---RGRDGEPGTPGNPGPPGPPGPPGPPGLGGNFA 60 70 80 90 100 110 120 130 140 150 pF1KE4 ----G-------------LPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPG------ : . :: : : :::: : :: : .: :: :: :: CCDS87 AQMAGGFDEKAGGAQLGVMQGPMGPMGPRGPPGPAGAPGPQGFQGNPGEPGEPGVSGPMG 120 130 140 150 160 170 160 170 180 190 pF1KE4 ---PPGPPGHPGVLPEGATDLQCPSICPPGP------PGPPGMPGFKGPTGY------KG ::::::.:: :.. . :::: :: ::.:: :: :: :: CCDS87 PRGPPGPPGKPGDDGEAGKPGKAGERGPPGPQGARGFPGTPGLPGVKGHRGYPGLDGAKG 180 190 200 210 220 230 200 210 220 230 240 pF1KE4 E------QGEVGKDGEKGDPGPPGPAGLPGSVGLQGP------RGLRGLPGPLGPPGDRG : .:: :. ::.:.::: :: :::: : :: :: : ::: :::: : CCDS87 EAGAPGVKGESGSPGENGSPGPMGPRGLPGERGRTGPAGAAGARGNDGQPGPAGPPGPVG 240 250 260 270 280 290 250 260 270 280 290 300 pF1KE4 PIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPSGEPGMPGKDG : : : :: ::: :.:: : ::::: .::.:. : :::: ::.: : :: :: CCDS87 PAGGPGFPGAPGAKGEAGPTGARGPEGAQGPRGE---P---GTPGSPGPAGASGNPGTDG 300 310 320 330 340 310 320 330 340 350 360 pF1KE4 QNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEAGPSGEPG :. : : : :: : :: .:: : : : : ::. 
:: : :: :: ::.:::: CCDS87 IPGAKGSAGAPGIAGAPGFPGPRGPPGPQGATGPLGPKGQTGEPGIAGFKGEQGPKGEPG 350 360 370 380 390 400 370 380 390 400 410 420 pF1KE4 VPGDAGMPGERGEAGHRGS---AGALGPQGPPG---APGVRGFQGQKGSMGDPGLPGPQG : : :: :: :.::. :..:: :::: ::: ::: :: : : : :: .: CCDS87 PAGPQGAPGPAGEEGKRGARGEPGGVGPIGPPGERGAPGNRGFPGQDGLAGPKGAPGERG 410 420 430 440 450 460 430 440 450 460 470 480 pF1KE4 LRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQG : .: .: .: : :. :. :. :: : :. ::.: :::.: : :. :: : :: CCDS87 PSGLAGPKGANGDPGRPGEPGLPGARGLTGRPGDAGPQGKVGPSGAPGEDGRPGPPGPQG 470 480 490 500 510 520 490 500 510 520 530 pF1KE4 PNGTSGVQGVPGP------PGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISE : ::.: ::: :: : .:.::.::. : :: :. .. .: .: CCDS87 ARGQPGVMGFPGPKGANGEPGKAGEKGLPGAPGLRGLPGKDGETGAAGPPGP--AGPAGE 530 540 550 560 570 580 540 550 560 570 580 590 pF1KE4 QIAQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTGELGDPGPR . : : : : : ::: :::: : ::. : :: : :: :: :: : :: : CCDS87 RGEQGA-----PGPSGFQGLPGPPGPPGEGGKPGDQGVPGEAGAPGLVGPRGERGFPGER 590 600 610 620 630 600 610 620 630 640 pF1KE4 GN---QGDRGDKGAAGA-GLDGPEGDQGPQGP---QGVPGTSK-DGQDGAPGEPGPPGDP :. :: .: .: :. : :::.: .:: :: :: :: . :. :: : :: :: CCDS87 GSPGAQGLQGPRGLPGTPGTDGPKGASGPAGPPGAQGPPGLQGMPGERGAAGIAGPKGDR 640 650 660 670 680 690 650 660 670 680 pF1KE4 GLPGAIGAQGTPGICDTSACQGAVLGGVGEKSGSRSS : : : .:.:: CCDS87 GDVGEKGPEGAPGKDGGRGLTGPIGPPGPAGANGEKGEVGPPGPAGSAGARGAPGERGET 700 710 720 730 740 750 >-- initn: 1123 init1: 1123 opt: 1323 Z-score: 528.3 bits: 109.2 E(32554): 4.1e-23 Smith-Waterman score: 1433; 47.3% identity (57.2% similar) in 488 aa overlap (29-515:717-1149) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLP :: :: ::::: : :. : :.::: CCDS87 GAAGIAGPKGDRGDVGEKGPEGAPGKDGGRGLTGPIGPPGPAGANGEKG---EVGPP--- 690 700 710 720 730 740 60 70 80 90 100 110 pF1KE4 GPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPG-LGGK :: : :: : ::. ::.: :: : : : :: :: :: :: :. : : :: : . 
CCDS87 GPAGSAGARGAPGERGETGPPGPAGFAGPPGADGQPGAKGEQGEAGQKGDAGAPGPQGPS 750 760 770 780 790 800 120 130 140 150 160 170 pF1KE4 GLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLPEGATDLQCPSI : ::: : .::.:: :. : .:::: .:.:: : :::: :.:: CCDS87 GAPGPQGPTGVTGPKGARGAQGPPGATGFPGAAGRVGPPGSNGNPG-------------- 810 820 830 840 180 190 200 210 220 230 pF1KE4 CPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLP ::::::: : : :: .:: :::: :: : ::::: : : CCDS87 -PPGPPGPSGKDGPKGA---------------RGDSGPPGRAGEP---GLQGPAGP---P 850 860 870 880 240 250 260 270 280 290 pF1KE4 GPLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPS : : ::: :: : .:::: : : ::.:: : : :: .: : :::.: :: : CCDS87 GEKGEPGDDGPSGAEGPPG-P--QGLAGQRGIVGLPGQRGERGFPGLPGPSGEPGKQGAP 890 900 910 920 930 940 300 310 320 330 340 350 pF1KE4 GEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGEL : : : : : ::: : :: ::.:.:: :: ::: :. : ::.::..: . CCDS87 GASGDRGPPGPVGPPGLTGPAGEPGREGSPGADGP------PGRDGAAGVKGDRGETGAV 950 960 970 980 990 360 370 380 390 400 410 pF1KE4 GEAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLP : : : :: :: :: :..:. :. :. : .::.:: :: :..: :: .:. :. : : CCDS87 GAPGAPGPPGSPGPAGPTGKQGDRGEAGAQGPMGPSGPAGARGIQGPQGPRGDKGEAGEP 1000 1010 1020 1030 1040 1050 420 430 440 450 460 470 pF1KE4 GPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPK : .::.: : : : :: : .: :..: : .: :: : :::.:..:. : :: CCDS87 GERGLKGHRGFTGLQGLPGPPGPSGDQGASGPAGPSGPRGPPGPVGPSGKDGANGIPGPI 1060 1070 1080 1090 1100 1110 480 490 500 510 520 530 pF1KE4 GTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQI : :: : :: : :::: : : :: :: ::. 
CCDS87 GPPGPRGRSGETGPAGPPGNPGPPGPPGPPG----PGIDMSAFAGLGPREKGPDPLQYMR 1120 1130 1140 1150 1160 1170 540 550 560 570 580 590 pF1KE4 AQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTGELGDPGPRGN CCDS87 ADQAAGGLRQHDAEVDATLKSLNNQIESIRSPEGSRKNPARTCRDLKLCHPEWKSGDYWI 1180 1190 1200 1210 1220 1230 >>CCDS41778.1 COL2A1 gene_id:1280|Hs108|chr12 (1487 aa) initn: 1212 init1: 1212 opt: 1851 Z-score: 729.4 bits: 146.5 E(32554): 2.6e-34 Smith-Waterman score: 1911; 46.4% identity (55.5% similar) in 685 aa overlap (28-660:116-780) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGL :: ::::: :: :. : : :. : : CCDS41 CPICPTDLATASGQPGPKGQKGEPGDIKDIVGPKGPPGPQGPAGEQGPRGDRGDKGEKGA 90 100 110 120 130 140 60 70 80 90 pF1KE4 PGPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGR---------------------DGPPGP ::: : : :: ::.:: : :: :: :: : .:: :: CCDS41 PGPRGRDGEPGTPGNPGPPGPPGPPGPPGLGGNFAAQMAGGFDEKAGGAQLGVMQGPMGP 150 160 170 180 190 200 100 110 120 130 140 150 pF1KE4 KGAPGERGSLGPPGPPGLGGKGLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPP : : : : ::: :. :. :: ::: ::::: .: :::::: : :: : : : CCDS41 MGPRGPPGPAGAPGPQGFQGN--PGEPGEPGVSGP---MGPRGPPGPPGKPGDDGEAGKP 210 220 230 240 250 260 160 170 180 190 200 210 pF1KE4 GPPGH---PGVLPEGATDLQCPSICP--PGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKG : :. :: :.:: . : : : ::. : :: .: : .:: :. ::.: CCDS41 GKAGERGPPG--PQGARGFPGTPGLPGVKGHRGYPGLDGAKGEAGAPGVKGESGSPGENG 270 280 290 300 310 220 230 240 250 260 pF1KE4 DPGPPGPAGLPGSVGLQGP------RGLRGLPGPLGPPGDRGPIGFRGPPGIPGAPGKAG .::: :: :::: : :: :: : ::: :::: :: : : :: ::: :.:: CCDS41 SPGPMGPRGLPGERGRTGPAGAAGARGNDGQPGPAGPPGPVGPAGGPGFPGAPGAKGEAG 320 330 340 350 360 370 270 280 290 300 310 320 pF1KE4 DRGERGPEGFRGPKGDLGRPGPKGTPGVAGPSGEPGMPGKDGQNGVPGLDGQKGEAGRNG : ::::: .::.:. : :::: ::.: : :: :: :. : : : :: : CCDS41 PTGARGPEGAQGPRGE-----P-GTPGSPGPAGASGNPGTDGIPGAKGSAGAPGIAGAPG 380 390 400 410 420 430 330 340 350 360 370 380 pF1KE4 APGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEAGPSGEPGVPGDAGMPGERGEAGHRG :: .:: : : : : ::. 
:: : :: :: ::.:::: : : :: :: :.:: CCDS41 FPGPRGPPGPQGATGPLGPKGQTGEPGIAGFKGEQGPKGEPGPAGPQGAPGPAGEEGKRG 440 450 460 470 480 490 390 400 410 420 430 pF1KE4 S---AGALGPQGPPG---APGVRGFQGQKGSMGDPGLPGPQGLRGDVGDRGPGGAAGPKG . :..:: :::: ::: ::: :: : : : :: .: : .: .: .: : : CCDS41 ARGEPGGVGPIGPPGERGAPGNRGFPGQDGLAGPKGAPGERGPSGLAGPKGANGDPGRPG 500 510 520 530 540 550 440 450 460 470 480 490 pF1KE4 DQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQGPNGTSGVQGVPGP----- . :. :. :: : :. ::.: :::.: : :. :: : :: : ::.: ::: CCDS41 EPGLPGARGLTGRPGDAGPQGKVGPSGAPGEDGRPGPPGPQGARGQPGVMGFPGPKGANG 560 570 580 590 600 610 500 510 520 530 540 550 pF1KE4 -PGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQIAQLAAHLRKPLAPGSI :: : .:.::.::. : :: :. .. .: .:. : : : : CCDS41 EPGKAGEKGLPGAPGLRGLPGKDGETGAAGPPGP--AGPAGERGEQGA-----PGPSGFQ 620 630 640 650 660 560 570 580 590 600 pF1KE4 GRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTGELGDPGPRGN---QGDRGDKGAAGA- : ::: :::: : ::. : :: : :: :: :: : :: ::. :: .: .: :. CCDS41 GLPGPPGPPGEGGKPGDQGVPGEAGAPGLVGPRGERGFPGERGSPGAQGLQGPRGLPGTP 670 680 690 700 710 720 610 620 630 640 650 660 pF1KE4 GLDGPEGDQGPQGP---QGVPGTSK-DGQDGAPGEPGPPGDPGLPGAIGAQGTPGICDTS : :::.: .:: :: :: :: . :. :: : :: :: : : : .:.:: CCDS41 GTDGPKGASGPAGPPGAQGPPGLQGMPGERGAAGIAGPKGDRGDVGEKGPEGAPGKDGGR 730 740 750 760 770 780 670 680 pF1KE4 ACQGAVLGGVGEKSGSRSS CCDS41 GLTGPIGPPGPAGANGEKGEVGPPGPAGSAGARGAPGERGETGPPGPAGFAGPPGADGQP 790 800 810 820 830 840 >-- initn: 1123 init1: 1123 opt: 1323 Z-score: 528.1 bits: 109.3 E(32554): 4.3e-23 Smith-Waterman score: 1433; 47.3% identity (57.2% similar) in 488 aa overlap (29-515:786-1218) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLP :: :: ::::: : :. : :.::: CCDS41 GAAGIAGPKGDRGDVGEKGPEGAPGKDGGRGLTGPIGPPGPAGANGEKG---EVGPP--- 760 770 780 790 800 60 70 80 90 100 110 pF1KE4 GPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGPPG-LGGK :: : :: : ::. ::.: :: : : : :: :: :: :: :. : : :: : . 
CCDS41 GPAGSAGARGAPGERGETGPPGPAGFAGPPGADGQPGAKGEQGEAGQKGDAGAPGPQGPS 810 820 830 840 850 860 120 130 140 150 160 170 pF1KE4 GLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLPEGATDLQCPSI : ::: : .::.:: :. : .:::: .:.:: : :::: :.:: CCDS41 GAPGPQGPTGVTGPKGARGAQGPPGATGFPGAAGRVGPPGSNGNPG-------------- 870 880 890 900 910 180 190 200 210 220 230 pF1KE4 CPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLP ::::::: : : :: .:: :::: :: : ::::: : : CCDS41 -PPGPPGPSGKDGPKGA---------------RGDSGPPGRAGEP---GLQGPAGP---P 920 930 940 950 240 250 260 270 280 290 pF1KE4 GPLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPS : : ::: :: : .:::: : : ::.:: : : :: .: : :::.: :: : CCDS41 GEKGEPGDDGPSGAEGPPG-P--QGLAGQRGIVGLPGQRGERGFPGLPGPSGEPGKQGAP 960 970 980 990 1000 1010 300 310 320 330 340 350 pF1KE4 GEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGEL : : : : : ::: : :: ::.:.:: :: ::: :. : ::.::..: . CCDS41 GASGDRGPPGPVGPPGLTGPAGEPGREGSPGADGP------PGRDGAAGVKGDRGETGAV 1020 1030 1040 1050 1060 360 370 380 390 400 410 pF1KE4 GEAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLP : : : :: :: :: :..:. :. :. : .::.:: :: :..: :: .:. :. : : CCDS41 GAPGAPGPPGSPGPAGPTGKQGDRGEAGAQGPMGPSGPAGARGIQGPQGPRGDKGEAGEP 1070 1080 1090 1100 1110 1120 420 430 440 450 460 470 pF1KE4 GPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPK : .::.: : : : :: : .: :..: : .: :: : :::.:..:. : :: CCDS41 GERGLKGHRGFTGLQGLPGPPGPSGDQGASGPAGPSGPRGPPGPVGPSGKDGANGIPGPI 1130 1140 1150 1160 1170 1180 480 490 500 510 520 530 pF1KE4 GTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQI : :: : :: : :::: : : :: :: ::. 
CCDS41 GPPGPRGRSGETGPAGPPGNPGPPGPPGPPG----PGIDMSAFAGLGPREKGPDPLQYMR 1190 1200 1210 1220 1230 1240 540 550 560 570 580 590 pF1KE4 AQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTGELGDPGPRGN CCDS41 ADQAAGGLRQHDAEVDATLKSLNNQIESIRSPEGSRKNPARTCRDLKLCHPEWKSGDYWI 1250 1260 1270 1280 1290 1300 >>CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa) initn: 1227 init1: 1227 opt: 1838 Z-score: 723.5 bits: 145.7 E(32554): 5.6e-34 Smith-Waterman score: 1974; 46.0% identity (57.2% similar) in 689 aa overlap (29-659:919-1577) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLP : :: : : :: :..: :: ::::: CCDS75 GFPGANGEKGGRGTPGKPGPRGQRGPTGPRGERGPRGITGKPGPKGNSGGDGPAGPPGER 890 900 910 920 930 940 60 70 80 90 100 pF1KE4 GP---------PGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPP :: ::::: :: ::: : : :: : :. :. ::::: :. : .: : CCDS75 GPNGPQGPTGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPPGVVGPQGPTGET 950 960 970 980 990 1000 110 120 130 140 150 160 pF1KE4 GPPGLGGK-GLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVL-PE :: : :. : :::::: :. : : : .: :::.:::: :::: : :: :. : CCDS75 GPMGERGHPGPPGPPGEQGLPGLAGKEGTKGDPGPAGLPGKDGPPGLRGFPGDRGLPGPV 1010 1020 1030 1040 1050 1060 170 180 190 200 210 220 pF1KE4 GATDLQCPSICPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGL :: :. . ::::::: : :: .::.: : : :. : .: ::: : : :: : CCDS75 GALGLK-GNEGPPGPPGPAGSPGERGPAGAAGPIGIPGRPGPQGPPGPAGEKGAPGEKGP 1070 1080 1090 1100 1110 1120 230 240 250 260 270 280 pF1KE4 QGPRGLRGLPGPLGPPGDRGPIGF------RGPPGIPGAPGKAGDRGERGPEGFRGPKGD ::: : :: ::.: :: ::.: .: : :: :. ::.::.:: : ::.: CCDS75 QGPAGRDGLQGPVGLPGPAGPVGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGP 1130 1140 1150 1160 1170 1180 290 300 310 320 pF1KE4 LGRPGPKGTPGVAGPSGE---------------PGMPGKDGQNGVPGLDGQKGEAGR--- .:.:::.:. : :: :. :: :: : .:.:: :.:::.: CCDS75 IGQPGPSGADGEPGPRGQQGLFGQKGDEGPRGFPGPPGPVGLQGLPGPPGEKGETGDVGQ 1190 1200 1210 1220 1230 1240 330 340 350 360 370 pF1KE4 ---NGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEAGPSGEPGVPGDAGMPG---E : :: .::.: :: : : : :. 
: .:: :: : .::::.::..: :: : CCDS75 MGPPGPPGPRGPSGAPGADGPQGPPGGIGNPGAVGEKGEPGEAGEPGLPGEGGPPGPKGE 1250 1260 1270 1280 1290 1300 380 390 400 410 420 430 pF1KE4 RGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLPGPQGLRGDVGDRGPGGAAGP ::: :. : .:: :: :: : :: : .:. : .: :: ::: : : .:. :: : CCDS75 RGEKGESGPSGAAGPPGPKGPPGDDGPKGSPGPVGFPGDPGPPGEPGPAGQDGP---PGD 1310 1320 1330 1340 1350 1360 440 450 460 470 480 490 pF1KE4 KGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQGPNGTSGVQGVPGPPG- :::.: :. : :: :: :::: : :. : : ::.: :: .:..: :. :::: CCDS75 KGDDGEPGQTGSPGPTGEPGPSG---PPGKRGPPGPAGPEGRQGEKGAKGEAGLEGPPGK 1370 1380 1390 1400 1410 1420 500 510 520 530 540 550 pF1KE4 --PLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQIAQLAAHLRKPLAPGSIG :.: ::.:: :: : :.:: ..:: :. : .:: : CCDS75 TGPIGPQGAPGKPGPDGLRGIPGP-VGEQ-------GL--------------PGSPGPDG 1430 1440 1450 560 570 580 590 600 pF1KE4 RPGPAGPPGPPG------PPGSIGHPGARG---PPGYRGPTGELGDPGPRGNQGDRGDKG ::: :::: :: : : :::: : ::: .: :. : :::.:..: .:..: CCDS75 PPGPMGPPGLPGLKGDSGPKGEKGHPGLIGLIGPPGEQGEKGDRGLPGPQGSSGPKGEQG 1460 1470 1480 1490 1500 1510 610 620 630 640 650 660 pF1KE4 AAGA----GLDGPEGDQGPQGPQGVPGTS-KDGQDGAPGEPGPPGDPGLPGAIGAQGTPG .: : :: : :: ::.:. :.: : : :.::::: :: :: . : : CCDS75 ITGPSGPIGPPGPPGLPGPPGPKGAKGSSGPTGPKGEAGHPGPPGPPGPPGEV-IQPLPI 1520 1530 1540 1550 1560 1570 670 680 pF1KE4 ICDTSACQGAVLGGVGEKSGSRSS CCDS75 QASRTRRNIDASQLLDDGNGENYVDYADGMEEIFGSLNSLKLEIEQMKRPLGTQQNPART 1580 1590 1600 1610 1620 1630 >-- initn: 2000 init1: 1027 opt: 1292 Z-score: 515.4 bits: 107.2 E(32554): 2.2e-22 Smith-Waterman score: 1360; 45.3% identity (55.9% similar) in 488 aa overlap (29-505:444-916) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKP-----GQ--DGIDGE :. :: : : :.: :. .: : CCDS75 ENYYDPYYDPTSSPSEIGPGMPANQDTIYEGIGGPRGEKGQKGEPAIIEPGMLIEGPPGP 420 430 440 450 460 470 60 70 80 90 100 110 pF1KE4 AGPPGLPGPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGP :: ::::::: : :. : ::: : :: ::. 
: : :::: : .: : CCDS75 EGPAGLPGPPGTMGPTGQVGDPGERGPPGRPGLPGADGLPGPPGTMLMLPFR--FGGGGD 480 490 500 510 520 530 120 130 140 150 160 pF1KE4 PGLGGKGLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGP---PGHPG-VLPE : : . . ..: . . ..:::: :: :: : ::: :::: :.:: : :. CCDS75 AGSKGPMVSAQESQAQAILQQARLALRGPAGPMGLTGRPGPVGPPGSGGLKGEPGDVGPQ 540 550 560 570 580 590 170 180 190 200 210 220 pF1KE4 GATDLQCPSICPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGL : .: ::::: : :: .: .: : .: :. : ::: : : ::::: : CCDS75 GPRGVQ-------GPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDRGFDGLAGLPGEKGH 600 610 620 630 640 230 240 250 260 270 280 pF1KE4 QGPRGLRGLPGPLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGP .: : : ::: : :.:: : :: :.:: :: : : .:: : :: : : : CCDS75 RGDPGPSGPPGPPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGPPGPPGVTGMDGQ 650 660 670 680 690 700 290 300 310 320 330 340 pF1KE4 KGTPGVAGPSGEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGE : : .::.:::: : ::.: :: .: : : : :::::: : ::::: :. : CCDS75 PGPKGNVGPQGEPGPP---GQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGLPGMPGADGP 710 720 730 740 750 760 350 360 370 380 390 400 pF1KE4 KGERGRAGELGEAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQ :. :. : :: : .: :: : :.:: :: : : : : .: : : ::.:. CCDS75 PGHPGKEGPPGEKGGQGPPGPQGPIGYPGPRGVKGADGIRGLKGTKGEKGEDGFPGFKGD 770 780 790 800 810 820 410 420 430 440 450 460 pF1KE4 KGSMGDPGLPGPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGE : :: : :: : ::. : .:: : .::.:: : : ::.::.:: :: : :. 
CCDS75 MGIKGDRGEIGPPGPRGEDGPEGPKGRGGPNGDPGPLGP---PGEKGKLGVPGLPGYPGR 830 840 850 860 870 470 480 490 500 510 520 pF1KE4 SGSRGELGPKGTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRE .: .: .: : : :: .: .:.:: ::: : .: : CCDS75 QGPKGSIGFPGFPGANGEKGGRGTPGKPGPRGQRGPTGPRGERGPRGITGKPGPKGNSGG 880 890 900 910 920 930 530 540 550 560 570 580 pF1KE4 LCGGMISEQIAQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTG CCDS75 DGPAGPPGERGPNGPQGPTGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPPGV 940 950 960 970 980 990 >>CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa) initn: 1227 init1: 1227 opt: 1838 Z-score: 723.5 bits: 145.7 E(32554): 5.6e-34 Smith-Waterman score: 1974; 46.0% identity (57.2% similar) in 689 aa overlap (29-659:919-1577) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKPGQDGIDGEAGPPGLP : :: : : :: :..: :: ::::: CCDS69 GFPGANGEKGGRGTPGKPGPRGQRGPTGPRGERGPRGITGKPGPKGNSGGDGPAGPPGER 890 900 910 920 930 940 60 70 80 90 100 pF1KE4 GP---------PGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPP :: ::::: :: ::: : : :: : :. :. ::::: :. : .: : CCDS69 GPNGPQGPTGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPPGVVGPQGPTGET 950 960 970 980 990 1000 110 120 130 140 150 160 pF1KE4 GPPGLGGK-GLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVL-PE :: : :. : :::::: :. : : : .: :::.:::: :::: : :: :. : CCDS69 GPMGERGHPGPPGPPGEQGLPGLAGKEGTKGDPGPAGLPGKDGPPGLRGFPGDRGLPGPV 1010 1020 1030 1040 1050 1060 170 180 190 200 210 220 pF1KE4 GATDLQCPSICPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGL :: :. . ::::::: : :: .::.: : : :. : .: ::: : : :: : CCDS69 GALGLK-GNEGPPGPPGPAGSPGERGPAGAAGPIGIPGRPGPQGPPGPAGEKGAPGEKGP 1070 1080 1090 1100 1110 1120 230 240 250 260 270 280 pF1KE4 QGPRGLRGLPGPLGPPGDRGPIGF------RGPPGIPGAPGKAGDRGERGPEGFRGPKGD ::: : :: ::.: :: ::.: .: : :: :. ::.::.:: : ::.: CCDS69 QGPAGRDGLQGPVGLPGPAGPVGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGP 1130 1140 1150 1160 1170 1180 290 300 310 320 pF1KE4 LGRPGPKGTPGVAGPSGE---------------PGMPGKDGQNGVPGLDGQKGEAGR--- .:.:::.:. : :: :. 
:: :: : .:.:: :.:::.: CCDS69 IGQPGPSGADGEPGPRGQQGLFGQKGDEGPRGFPGPPGPVGLQGLPGPPGEKGETGDVGQ 1190 1200 1210 1220 1230 1240 330 340 350 360 370 pF1KE4 ---NGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGELGEAGPSGEPGVPGDAGMPG---E : :: .::.: :: : : : :. : .:: :: : .::::.::..: :: : CCDS69 MGPPGPPGPRGPSGAPGADGPQGPPGGIGNPGAVGEKGEPGEAGEPGLPGEGGPPGPKGE 1250 1260 1270 1280 1290 1300 380 390 400 410 420 430 pF1KE4 RGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLPGPQGLRGDVGDRGPGGAAGP ::: :. : .:: :: :: : :: : .:. : .: :: ::: : : .:. :: : CCDS69 RGEKGESGPSGAAGPPGPKGPPGDDGPKGSPGPVGFPGDPGPPGEPGPAGQDGP---PGD 1310 1320 1330 1340 1350 1360 440 450 460 470 480 490 pF1KE4 KGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPKGTQGPNGTSGVQGVPGPPG- :::.: :. : :: :: :::: : :. : : ::.: :: .:..: :. :::: CCDS69 KGDDGEPGQTGSPGPTGEPGPSG---PPGKRGPPGPAGPEGRQGEKGAKGEAGLEGPPGK 1370 1380 1390 1400 1410 1420 500 510 520 530 540 550 pF1KE4 --PLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQIAQLAAHLRKPLAPGSIG :.: ::.:: :: : :.:: ..:: :. : .:: : CCDS69 TGPIGPQGAPGKPGPDGLRGIPGP-VGEQ-------GL--------------PGSPGPDG 1430 1440 1450 560 570 580 590 600 pF1KE4 RPGPAGPPGPPG------PPGSIGHPGARG---PPGYRGPTGELGDPGPRGNQGDRGDKG ::: :::: :: : : :::: : ::: .: :. : :::.:..: .:..: CCDS69 PPGPMGPPGLPGLKGDSGPKGEKGHPGLIGLIGPPGEQGEKGDRGLPGPQGSSGPKGEQG 1460 1470 1480 1490 1500 1510 610 620 630 640 650 660 pF1KE4 AAGA----GLDGPEGDQGPQGPQGVPGTS-KDGQDGAPGEPGPPGDPGLPGAIGAQGTPG .: : :: : :: ::.:. :.: : : :.::::: :: :: . : : CCDS69 ITGPSGPIGPPGPPGLPGPPGPKGAKGSSGPTGPKGEAGHPGPPGPPGPPGEV-IQPLPI 1520 1530 1540 1550 1560 1570 670 680 pF1KE4 ICDTSACQGAVLGGVGEKSGSRSS CCDS69 QASRTRRNIDASQLLDDGNGENYVDYADGMEEIFGSLNSLKLEIEQMKRPLGTQQNPART 1580 1590 1600 1610 1620 1630 >-- initn: 2000 init1: 1027 opt: 1292 Z-score: 515.4 bits: 107.2 E(32554): 2.2e-22 Smith-Waterman score: 1360; 45.3% identity (55.9% similar) in 488 aa overlap (29-505:444-916) 10 20 30 40 50 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGLPGPPGPPGPPGKP-----GQ--DGIDGE :. :: : : :.: :. 
.: : CCDS69 ENYYDPYYDPTSSPSEIGPGMPANQDTIYEGIGGPRGEKGQKGEPAIIEPGMLIEGPPGP 420 430 440 450 460 470 60 70 80 90 100 110 pF1KE4 AGPPGLPGPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGERGSLGPPGP :: ::::::: : :. : ::: : :: ::. : : :::: : .: : CCDS69 EGPAGLPGPPGTMGPTGQVGDPGERGPPGRPGLPGADGLPGPPGTMLMLPFR--FGGGGD 480 490 500 510 520 530 120 130 140 150 160 pF1KE4 PGLGGKGLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGP---PGHPG-VLPE : : . . ..: . . ..:::: :: :: : ::: :::: :.:: : :. CCDS69 AGSKGPMVSAQESQAQAILQQARLALRGPAGPMGLTGRPGPVGPPGSGGLKGEPGDVGPQ 540 550 560 570 580 590 170 180 190 200 210 220 pF1KE4 GATDLQCPSICPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGL : .: ::::: : :: .: .: : .: :. : ::: : : ::::: : CCDS69 GPRGVQ-------GPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDRGFDGLAGLPGEKGH 600 610 620 630 640 230 240 250 260 270 280 pF1KE4 QGPRGLRGLPGPLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGP .: : : ::: : :.:: : :: :.:: :: : : .:: : :: : : : CCDS69 RGDPGPSGPPGPPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGPPGPPGVTGMDGQ 650 660 670 680 690 700 290 300 310 320 330 340 pF1KE4 KGTPGVAGPSGEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGE : : .::.:::: : ::.: :: .: : : : :::::: : ::::: :. : CCDS69 PGPKGNVGPQGEPGPP---GQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGLPGMPGADGP 710 720 730 740 750 760 350 360 370 380 390 400 pF1KE4 KGERGRAGELGEAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQ :. :. : :: : .: :: : :.:: :: : : : : .: : : ::.:. CCDS69 PGHPGKEGPPGEKGGQGPPGPQGPIGYPGPRGVKGADGIRGLKGTKGEKGEDGFPGFKGD 770 780 790 800 810 820 410 420 430 440 450 460 pF1KE4 KGSMGDPGLPGPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGE : :: : :: : ::. : .:: : .::.:: : : ::.::.:: :: : :. 
CCDS69 MGIKGDRGEIGPPGPRGEDGPEGPKGRGGPNGDPGPLGP---PGEKGKLGVPGLPGYPGR 830 840 850 860 870 470 480 490 500 510 520 pF1KE4 SGSRGELGPKGTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRE .: .: .: : : :: .: .:.:: ::: : .: : CCDS69 QGPKGSIGFPGFPGANGEKGGRGTPGKPGPRGQRGPTGPRGERGPRGITGKPGPKGNSGG 880 890 900 910 920 930 530 540 550 560 570 580 pF1KE4 LCGGMISEQIAQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSIGHPGARGPPGYRGPTG CCDS69 DGPAGPPGERGPNGPQGPTGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPPGV 940 950 960 970 980 990 >>CCDS33350.1 COL5A2 gene_id:1290|Hs108|chr2 (1499 aa) initn: 1210 init1: 1210 opt: 1833 Z-score: 722.5 bits: 145.2 E(32554): 6.3e-34 Smith-Waterman score: 1873; 46.6% identity (57.6% similar) in 672 aa overlap (24-669:205-852) 10 20 30 40 pF1KE4 MAGPRACAPLLLLLLLGELLAAAGAQRVGL-PG---PPGPPGPPGKPGQDGID :.: ::: :: : :: :: : ::.: CCDS33 PPGHPSHPGPDGLSRPFSAQMAGLDEKSGLGSQ-VGLMPGSVGPVGPRGPQGLQGQQGGA 180 190 200 210 220 230 50 60 70 80 90 100 pF1KE4 GEAGPPGLPGPPGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGP---KGAPGERGSL : .:::: :: ::: : :. : : :: :: :: ::.: :: :.:: :: CCDS33 GPTGPPGEPGDPGPMGPIGSRGPEGP---PGKPGEDGEPGRNGNPGEVGFAGSPGARGFP 240 250 260 270 280 290 110 120 130 140 150 pF1KE4 GPPGPPGLGG----KGLPGPPGEAGV------SGPPGGIGLRGPPGPSGLPGLPGPPGPP : :: ::: : ::: :: ::.:. .:: : .: :: :: :.:: : :: CCDS33 GAPGLPGLKGHRGHKGLEGPKGEVGAPGSKGEAGPTGPMGAMGPLGPRGMPGERGRLGPQ 300 310 320 330 340 350 160 170 180 190 200 210 pF1KE4 GPPGHPGV--LPEGATDLQCPSICP--PGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGD : ::. :. .: : . : : : :: ::: : :::: .: .: :. :: : CCDS33 GAPGQRGAHGMP-GKPGPMGPLGIPGSSGFPGNPGMKGEAGPTGARGPEGPQGQRGETGP 360 370 380 390 400 220 230 240 250 260 270 pF1KE4 PGPPGPAGLPGSVGLQGPRGLRGLPGPLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGP ::: : ::::..: .: : .: : : :: :: : :::: :: :..: .: :: CCDS33 PGPVGSPGLPGAIGTDGTPGAKG---PTGSPGTSGPPGSAGPPGSPGPQGSTGPQGIRGQ 410 420 430 440 450 460 280 290 300 310 320 pF1KE4 EGFRGPKGDLGRPGPKGTPG---VAGPSGEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGE : : : :. :::: :: . :: : :: :: : : :: : : .:. ::::. 
CCDS33 PGDPGVPGFKGEAGPKGEPGPHGIQGPIGPPGEEGKRGPRGDPGTVGPPGPVGERGAPGN 470 480 490 500 510 520 330 340 350 360 370 380 pF1KE4 KGPNGLPGLPGRAGSKGEKGERGRAGELGEAGPSGEPGVPGDAGMPGERGEAGHRGSAGA .: : :::: :: .:::: .: : : .:.:: ::. :.:: :: .:. : : CCDS33 RGFPGSDGLPG---PKGAQGERGPVGSSGPKGSQGDPGRPGEPGLPGARGLTGNPGVQGP 530 540 550 560 570 580 390 400 410 420 430 440 pF1KE4 LGPQGPPGAPGVRGFQGQKGSMGDPGLPGPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGL : :: :::: : : ::.: : :: .:: : :. : : : :. :. :. : CCDS33 EGKLGPLGAPGEDGRPGPPGSIGIRGQPGSMGLPGPKGSSGDPGKPGEAGNAGVPGQRGA 590 600 610 620 630 640 450 460 470 480 490 500 pF1KE4 PGDKGELGPSGLVGPKGESGSRGELGPKGTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGI :: ::.:::: ::: : .: ::: ::: : .: ::.:::::: : : :: :. CCDS33 PGKDGEVGPSGPVGPPGLAGERGE------QGPPGPTGFQGLPGPPGPPGEGGKPGDQGV 650 660 670 680 690 510 520 530 540 550 560 pF1KE4 TGKPGVPGKEASEQRIRELCGGMISEQIAQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPG : ::. : . . . : : :. : . .: .: : : :: : ::: : :: CCDS33 PGDPGAVGPLGPRGE-RGNPGERGEPGITGLPG--EKGMAGGH-GPDGPKGSPGPSGTPG 700 710 720 730 740 750 570 580 590 600 610 620 pF1KE4 SIGHPGARGPPGYRGPTGELGDPGPRGNQGDRGDKGAAG-AGLDGPEGDQGPQGPQGVPG . : :: .: :: :: .: :::.:..: :.::: : :: :: .: :: :: : : CCDS33 DTGPPGLQGMPGERGIAGT---PGPKGDRGGIGEKGAEGTAGNDGARGLPGPLGPPGPAG 760 770 780 790 800 810 630 640 650 660 670 680 pF1KE4 -TSKDGQDGAPGEPGPPGDPGLPGAIGAQGTPGICDTSACQGAVLGGVGEKSGSRSS :.. :. : : ::::. : ::. : .: : .. :: CCDS33 PTGEKGEPGPRGLVGPPGSRGNPGSRGENGPTGAVGFAGPQGPDGQPGVKGEPGEPGQKG 820 830 840 850 860 870 CCDS33 DAGSPGPQGLAGSPGPHGPNGVPGLKGGRGTQGPPGATGFPGSAGRVGPPGPAGAPGPAG 880 890 900 910 920 930 >-- initn: 1089 init1: 1089 opt: 1128 Z-score: 453.8 bits: 95.5 E(32554): 5.9e-19 Smith-Waterman score: 1208; 40.3% identity (51.0% similar) in 553 aa overlap (91-632:854-1297) 70 80 90 100 110 pF1KE4 PGPKGAPGKPGKPGEAGLPGLPGVDGLTGRDGPPGPKGAPGE---RGSLGPPGPPGLGGK :: :: :: ::: .:. : ::: ::.:. 
CCDS33 VGPPGSRGNPGSRGENGPTGAVGFAGPQGPDGQPGVKGEPGEPGQKGDAGSPGPQGLAGS 830 840 850 860 870 880 120 130 140 150 160 170 pF1KE4 GLPGPPGEAGVSGPPGGIGLRGPPGPSGLPGLPGPPGPPGPPGHPGVLPEGATDLQCPSI ::: : :: : :: : .:::: .:.:: : ::::: : CCDS33 --PGPHGPNGVPGLKGGRGTQGPPGATGFPGSAGRVGPPGPAG----------------- 890 900 910 920 180 190 200 210 220 230 pF1KE4 CPPGPPGPPGMPGFKGPTGYKGEQGEVGKDGEKGDPGPPGPAGLPGSVGLQGPRGLRGLP ::: :: : :: .:: : .:. : . :. :: :: :: : ::. .: : : : CCDS33 -APGPAGPLGEPGKEGPPGLRGDPG---SHGRVGDRGPAGPPGGPGD---KGDPGEDGQP 930 940 950 960 970 240 250 260 270 280 290 pF1KE4 GPLGPPGDRGPIGFRGPPGIPGAPGKAGDRGERGPEGFRGPKGDLGRPGPKGTPGVAGPS :: ::: :: : : :: : ::. ::::: :. ::: :::: .::. CCDS33 GPDGPP---GPAGTTGQRGIVGMPGQ---RGERGMPGL---------PGPAGTPGKVGPT 980 990 1000 1010 1020 300 310 320 330 340 350 pF1KE4 GEPGMPGKDGQNGVPGLDGQKGEAGRNGAPGEKGPNGLPGLPGRAGSKGEKGERGRAGEL : : : : : :: .: :: :: .:: : : ::: :. ::.:.:: : CCDS33 GATGDKGPPGPVGPPGSNGPVGE------PGPEGPAGNDGTPGRDGAVGERGDRGDPGPA 1030 1040 1050 1060 1070 360 370 380 390 400 410 pF1KE4 GEAGPSGEPGVPGDAGMPGERGEAGHRGSAGALGPQGPPGAPGVRGFQGQKGSMGDPGLP : : .: ::.:: .: ::. ::.::. :. :: :::: : :: :: CCDS33 GLPGSQGAPGTPGPVGAPGD---AGQRGDPGSRGPIGPPGRAGKRG------------LP 1080 1090 1100 1110 1120 420 430 440 450 460 470 pF1KE4 GPQGLRGDVGDRGPGGAAGPKGDQGIAGSDGLPGDKGELGPSGLVGPKGESGSRGELGPK :::: ::: ::.: : : :: .:..: .:::: : : : .: .: : .::. CCDS33 GPQGPRGDKGDHGDRGDRGQKGHRGFTGLQGLPG------PPGPNGEQGSAGIPGPFGPR 1130 1140 1150 1160 1170 480 490 500 510 520 530 pF1KE4 GTQGPNGTSGVQGVPGPPGPLGLQGVPGVPGITGKPGVPGKEASEQRIRELCGGMISEQI : :: : :: .: ::: ::.: : ::. :. : : : CCDS33 GPPGPVGPSGKEGNPGPLGPIG----P--PGVRGSVGEAGPE------------------ 1180 1190 1200 1210 540 550 560 570 580 pF1KE4 AQLAAHLRKPLAPGSIGRPGPAGPPGPPGPPGSI--------GHPGARGPPGYRGPTGEL : :: :::::::::: . :: : : . CCDS33 ----------------GPPGEPGPPGPPGPPGHLTAALGDIMGHYDESMPDPLPEFTEDQ 1220 1230 1240 1250 590 600 610 620 630 640 pF1KE4 GDPGPRGNQGDRGDKGAAGAGLDGPEGDQGPQGPQGVPGTSKDGQDGAPGEPGPPGDPGL . : . :. : : ... . 
. : ..:.: . :. . : CCDS33 AAPDDK-NKTDPGVHATLKSLSSQIETMRSPDGSKKHPARTCDDLKLCHSAKQSGEYWID 1260 1270 1280 1290 1300 1310 650 660 670 680 pF1KE4 PGAIGAQGTPGICDTSACQGAVLGGVGEKSGSRSS CCDS33 PNQGSVEDAIKVYCNMETGETCISANPSSVPRKTWWASKSPDNKPVWYGLDMNRGSQFAY 1320 1330 1340 1350 1360 1370 684 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Sat Nov 5 23:56:22 2016 done: Sat Nov 5 23:56:23 2016 Total Scan time: 4.870 Total Display time: 0.320 Function used was FASTA [36.3.4 Apr, 2011]
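Note on reading the per-hit statistics: each alignment in this report is preceded by an "initn: ... init1: ... opt: ... Z-score: ... bits: ... E(...): ..." line followed by a "Smith-Waterman score: ...; ...% identity (...% similar) in ... aa overlap (q_start-q_end:s_start-s_end)" line. The following is a minimal sketch, not part of FASTA itself, of how those two lines could be pulled into records with Python's standard `re` module. The function name `parse_hit_stats` and the dictionary keys are hypothetical, and the pattern assumes the field order seen in this particular report.

```python
import re

# Hypothetical helper, not a FASTA tool: matches the two statistics lines
# that precede each alignment, assuming the field order shown in this report.
HIT_RE = re.compile(
    r"initn:\s*(?P<initn>\d+)\s+init1:\s*(?P<init1>\d+)\s+opt:\s*(?P<opt>\d+)\s+"
    r"Z-score:\s*(?P<zscore>-?[\d.]+)\s+bits:\s*(?P<bits>[\d.]+)\s+"
    r"E\(\d+\):\s*(?P<evalue>[\d.eE+-]+)\s+"
    r"Smith-Waterman score:\s*(?P<sw>\d+);\s*(?P<ident>[\d.]+)% identity\s*"
    r"\((?P<sim>[\d.]+)% similar\) in\s*(?P<overlap>\d+) aa overlap\s*"
    r"\((?P<qs>\d+)-(?P<qe>\d+):(?P<ss>\d+)-(?P<se>\d+)\)"
)


def parse_hit_stats(text):
    """Return one dict per alignment statistics block found in `text`."""
    hits = []
    for m in HIT_RE.finditer(text):
        hits.append({
            "initn": int(m["initn"]),
            "init1": int(m["init1"]),
            "opt": int(m["opt"]),
            "z_score": float(m["zscore"]),
            "bits": float(m["bits"]),
            "e_value": float(m["evalue"]),
            "sw_score": int(m["sw"]),
            "pct_identity": float(m["ident"]),
            "pct_similar": float(m["sim"]),
            "overlap_aa": int(m["overlap"]),
            # (start, end) residue ranges for query and library sequence
            "query_range": (int(m["qs"]), int(m["qe"])),
            "subject_range": (int(m["ss"]), int(m["se"])),
        })
    return hits


if __name__ == "__main__":
    # Statistics lines copied from the COL5A1 hit in this report.
    sample = (
        "initn: 1227 init1: 1227 opt: 1838 Z-score: 723.5 bits: 145.7 "
        "E(32554): 5.6e-34 Smith-Waterman score: 1974; 46.0% identity "
        "(57.2% similar) in 689 aa overlap (29-659:919-1577)"
    )
    for hit in parse_hit_stats(sample):
        print(hit)
```

Because the pattern tolerates any whitespace between fields, it should work both on the original line-oriented output and on text whose line breaks have been collapsed, as here. It does not capture the hit identifier (the ">>CCDS..." header), which is absent for secondary ">--" alignments.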