FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011 Please cite: W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448 Query: pF1KB8291, 472 aa 1>>>pF1KB8291 472 - 472 aa - 472 aa Library: human.CCDS.faa 18511270 residues in 32554 sequences Statistics: Expectation_n fit: rho(ln(x))= 8.3974+/-0.000829; mu= 6.1220+/- 0.051 mean_var=181.8947+/-36.448, 0's: 0 Z-trim(114.2): 42 B-trim: 89 in 1/52 Lambda= 0.095096 statistics sampled from 14763 (14804) to 14763 sequences Algorithm: FASTA (3.7 Nov 2010) [optimized] Parameters: BL50 matrix (15:-5), open/ext: -10/-2 ktup: 2, E-join: 1 (0.766), E-opt: 0.2 (0.455), width: 16 Scan time: 1.880 The best scores are: opt bits E(32554) CCDS41730.1 PKNOX2 gene_id:63876|Hs108|chr11 ( 472) 3133 441.6 8.2e-124 CCDS68211.1 PKNOX1 gene_id:5316|Hs108|chr21 ( 319) 951 142.2 7.9e-34 CCDS13692.1 PKNOX1 gene_id:5316|Hs108|chr21 ( 436) 794 120.7 3.1e-27 >>CCDS41730.1 PKNOX2 gene_id:63876|Hs108|chr11 (472 aa) initn: 3133 init1: 3133 opt: 3133 Z-score: 2336.3 bits: 441.6 E(32554): 8.2e-124 Smith-Waterman score: 3133; 100.0% identity (100.0% similar) in 472 aa overlap (1-472:1-472) 10 20 30 40 50 60 pF1KB8 MMQHASPAPALTMMATQNVPPPPYQDSPQMTATAQPPSKAQAVHISAPSAAASTPVPSAP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 MMQHASPAPALTMMATQNVPPPPYQDSPQMTATAQPPSKAQAVHISAPSAAASTPVPSAP 10 20 30 40 50 60 70 80 90 100 110 120 pF1KB8 IDPQAQLEADKRAVYRHPLFPLLTLLFEKCEQATQGSECITSASFDVDIENFVHQQEQEH :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 IDPQAQLEADKRAVYRHPLFPLLTLLFEKCEQATQGSECITSASFDVDIENFVHQQEQEH 70 80 90 100 110 120 130 140 150 160 170 180 pF1KB8 KPFFSDDPELDNLMVKAIQVLRIHLLELEKVNELCKDFCNRYITCLKTKMHSDNLLRNDL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 KPFFSDDPELDNLMVKAIQVLRIHLLELEKVNELCKDFCNRYITCLKTKMHSDNLLRNDL 130 140 150 160 170 180 190 200 210 220 230 240 pF1KB8 GGPYSPNQPSINLHSQDLLQNSPNSMSGVSNNPQGIVVPASALQQGNIAMTTVNSQVVSG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 GGPYSPNQPSINLHSQDLLQNSPNSMSGVSNNPQGIVVPASALQQGNIAMTTVNSQVVSG 190 200 210 220 230 240 250 260 270 280 290 300 pF1KB8 GALYQPVTMVTSQGQVVTQAIPQGAIQIQNTQVNLDLTSLLDNEDKKSKNKRGVLPKHAT :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 GALYQPVTMVTSQGQVVTQAIPQGAIQIQNTQVNLDLTSLLDNEDKKSKNKRGVLPKHAT 250 260 270 280 290 300 310 320 330 340 350 360 pF1KB8 NIMRSWLFQHLMHPYPTEDEKRQIAAQTNLTLLQVNNWFINARRRILQPMLDASNPDPAP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 NIMRSWLFQHLMHPYPTEDEKRQIAAQTNLTLLQVNNWFINARRRILQPMLDASNPDPAP 310 320 330 340 350 360 370 380 390 400 410 420 pF1KB8 KAKKIKSQHRPTQRFWPNSIAAGVLQQQGGAPGTNPDGSINLDNLQSLSSDSATMAMQQA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 KAKKIKSQHRPTQRFWPNSIAAGVLQQQGGAPGTNPDGSINLDNLQSLSSDSATMAMQQA 370 380 390 400 410 420 430 440 450 460 470 pF1KB8 MMAAHDDSLDGTEEEDEDEMEEEEEEELEEEVDELQTTNVSDLGLEHSDSLE :::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS41 MMAAHDDSLDGTEEEDEDEMEEEEEEELEEEVDELQTTNVSDLGLEHSDSLE 430 440 450 460 470 >>CCDS68211.1 PKNOX1 gene_id:5316|Hs108|chr21 (319 aa) initn: 1110 init1: 457 opt: 951 Z-score: 720.8 bits: 142.2 E(32554): 7.9e-34 Smith-Waterman score: 1209; 58.5% identity (79.0% similar) in 352 aa overlap (134-472:1-319) 110 120 130 140 150 160 pF1KB8 SFDVDIENFVHQQEQEHKPFFSDDPELDNLMVKAIQVLRIHLLELEKVNELCKDFCNRYI ::::::::::::::::::::::::::.::: CCDS68 MVKAIQVLRIHLLELEKVNELCKDFCSRYI 10 20 30 170 180 190 200 210 220 pF1KB8 TCLKTKMHSDNLLRNDLGGPYSPNQPSINLHSQDLLQNSPNSMSGVSNNPQGIVVPASAL .::::::.:..:: .. :.:::: : ::.. ....: . .::::::::::: CCDS68 ACLKTKMNSETLLSGEPGSPYSPVQ------SQQI----QSAITG-TISPQGIVVPASAL 40 50 60 70 230 240 250 260 270 280 pF1KB8 QQGNIAMTTVNSQVVSGGALYQPVTMVTSQGQVVTQAIPQGAIQIQNTQVNLDLT---SL ::::.::.:: .::..:::::.:: :::::::.. :.:.:::.:..:.:. :. CCDS68 QQGNVAMATV-----AGGTVYQPVTVVTPQGQVVTQTLSPGTIRIQNSQLQLQLNQDLSI 80 90 100 110 120 130 290 300 310 320 330 340 pF1KB8 LDNEDKKSKNKRGVLPKHATNIMRSWLFQHLMHPYPTEDEKRQIAAQTNLTLLQVNNWFI : ..: .::::::::::::::.::::::::. :::::::::.:::::::::::::::::: CCDS68 LHQDDGSSKNKRGVLPKHATNVMRSWLFQHIGHPYPTEDEKKQIAAQTNLTLLQVNNWFI 140 150 160 170 180 190 350 360 370 380 390 pF1KB8 NARRRILQPMLDASNPDPAPKAKKIKSQHRPTQRFWPNSIAAGVLQ--------QQGGAP ::::::::::::.: . .::.:: .:.::.:::::.:::.:: : ..:.. CCDS68 NARRRILQPMLDSSCSE-TPKTKKKTAQNRPVQRFWPDSIASGVAQPPPSELTMSEGAVV 200 210 220 230 240 250 400 410 420 430 440 450 pF1KB8 GTNPDGSINLDNLQSLSSDSATMAMQQAMMAAH--DDSLDGTEEEDEDEMEEEEEEELEE . ..:.:.:::::::.::.:.::.:::.. :.:.:.::: CCDS68 TITTPVNMNVDSLQSLSSDGATLAVQQVMMAGQSEDESVDSTEE---------------- 260 270 280 290 460 470 pF1KB8 EVDELQTTNVSDLGLEHSDSLE .. : ...: : ::.::::. CCDS68 DAGALAPAHISGLVLENSDSLQ 300 310 >>CCDS13692.1 PKNOX1 gene_id:5316|Hs108|chr21 (436 aa) initn: 1540 init1: 675 opt: 794 Z-score: 602.5 bits: 120.7 E(32554): 3.1e-27 Smith-Waterman score: 1650; 58.1% identity (79.1% similar) in 473 aa overlap (13-472:1-436) 10 20 30 40 50 60 pF1KB8 MMQHASPAPALTMMATQNVPPPPYQDSPQMTATAQPPSKAQAVHISAPSAAASTPVPSAP :::::.. :::. :: .... .. : . : :.: . .: : : CCDS13 MMATQTLSIDSYQDGQQMQVVTELKTE-QDPNCSEPDAEGVSP-P--P 10 20 30 40 70 80 90 100 110 120 pF1KB8 IDPQAQLEADKRAVYRHPLFPLLTLLFEKCEQATQGSECITSASFDVDIENFVHQQEQEH .. :. ...::.:.:::::::::.::::::::.::::: :::::::::::::..::.: CCDS13 VESQTPMDVDKQAIYRHPLFPLLALLFEKCEQSTQGSEGTTSASFDVDIENFVRKQEKEG 50 60 70 80 90 100 130 140 150 160 170 180 pF1KB8 KPFFSDDPELDNLMVKAIQVLRIHLLELEKVNELCKDFCNRYITCLKTKMHSDNLLRNDL :::: .::: :::::::::::::::::::::::::::::.:::.::::::.:..:: .. CCDS13 KPFFCEDPETDNLMVKAIQVLRIHLLELEKVNELCKDFCSRYIACLKTKMNSETLLSGEP 110 120 130 140 150 160 190 200 210 220 230 240 pF1KB8 GGPYSPNQPSINLHSQDLLQNSPNSMSGVSNNPQGIVVPASALQQGNIAMTTVNSQVVSG :.:::: : ::.. ....: . .:::::::::::::::.::.:: .: CCDS13 GSPYSPVQ------SQQI----QSAITG-TISPQGIVVPASALQQGNVAMATV-----AG 170 180 190 200 250 260 270 280 290 pF1KB8 GALYQPVTMVTSQGQVVTQAIPQGAIQIQNTQVNLDLT---SLLDNEDKKSKNKRGVLPK :..:::::.:: :::::::.. :.:.:::.:..:.:. :.: ..: .:::::::::: CCDS13 GTVYQPVTVVTPQGQVVTQTLSPGTIRIQNSQLQLQLNQDLSILHQDDGSSKNKRGVLPK 210 220 230 240 250 260 300 310 320 330 340 350 pF1KB8 HATNIMRSWLFQHLMHPYPTEDEKRQIAAQTNLTLLQVNNWFINARRRILQPMLDASNPD ::::.::::::::. :::::::::.::::::::::::::::::::::::::::::.: . CCDS13 HATNVMRSWLFQHIGHPYPTEDEKKQIAAQTNLTLLQVNNWFINARRRILQPMLDSSCSE 270 280 290 300 310 320 360 370 380 390 400 pF1KB8 PAPKAKKIKSQHRPTQRFWPNSIAAGVLQ--------QQGGAPGTNPDGSINLDNLQSLS .::.:: .:.::.:::::.:::.:: : ..:.. . ..:.:.::::: CCDS13 -TPKTKKKTAQNRPVQRFWPDSIASGVAQPPPSELTMSEGAVVTITTPVNMNVDSLQSLS 330 340 350 360 370 380 410 420 430 440 450 460 pF1KB8 SDSATMAMQQAMMAAH--DDSLDGTEEEDEDEMEEEEEEELEEEVDELQTTNVSDLGLEH ::.::.:.::.:::.. :.:.:.:::. . : ...: : ::. CCDS13 SDGATLAVQQVMMAGQSEDESVDSTEED----------------AGALAPAHISGLVLEN 390 400 410 420 430 470 pF1KB8 SDSLE ::::. CCDS13 SDSLQ 472 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Fri Nov 4 11:26:26 2016 done: Fri Nov 4 11:26:27 2016 Total Scan time: 1.880 Total Display time: -0.010 Function used was FASTA [36.3.4 Apr, 2011]