FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE2278, 342 aa
  1>>>pF1KE2278 342 - 342 aa - 342 aa
Library: /omim/omim.rfq.tfa
  60827320 residues in 85289 sequences

Statistics: Expectation_n fit: rho(ln(x))= 6.0112+/-0.000419; mu= 12.3517+/- 0.026
 mean_var=84.5454+/-17.392, 0's: 0 Z-trim(111.4): 127 B-trim: 0 in 0/52
 Lambda= 0.139486
 statistics sampled from 19858 (19986) to 19858 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.596), E-opt: 0.2 (0.234), width: 16
 Scan time: 6.970

The best scores are:                                       opt  bits E(85289)
NP_005819 (OMIM: 605973) DDB1- and CUL4-associated ( 342) 2393 491.8 9.1e-139
NP_113673 (OMIM: 610597) glutamate-rich WD repeat- ( 446)  196  49.7 1.4e-05
NP_001128728 (OMIM: 602923) histone-binding protei ( 390)  155  41.4 0.0038
NP_001287673 (OMIM: 610257) protein transport prot (1067)  161  42.9 0.0038
NP_001070674 (OMIM: 610257) protein transport prot (1106)  161  42.9 0.004
NP_001128727 (OMIM: 602923) histone-binding protei ( 424)  155  41.5 0.0041
NP_005601 (OMIM: 602923) histone-binding protein R ( 425)  155  41.5 0.0041
NP_001287674 (OMIM: 610257) protein transport prot (1166)  161  42.9 0.0041
NP_001177978 (OMIM: 610257) protein transport prot (1200)  161  42.9 0.0042
NP_056305 (OMIM: 610258) protein transport protein (1179)  156  41.9 0.0084

>>NP_005819 (OMIM: 605973) DDB1- and CUL4-associated fac (342 aa)
 initn: 2393 init1: 2393 opt: 2393 Z-score: 2612.3 bits: 491.8 E(85289): 9.1e-139
Smith-Waterman score: 2393; 100.0% identity (100.0% similar) in 342 aa overlap (1-342:1-342)

               10        20        30        40        50        60
pF1KE2 MSLHGKRKEIYKYEAPWTVYAMNWSVRPDKRFRLALGSFVEEYNNKVQLVGLDEESSEFI
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
NP_005 MSLHGKRKEIYKYEAPWTVYAMNWSVRPDKRFRLALGSFVEEYNNKVQLVGLDEESSEFI
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE2 
CRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDYLRVWRVGETETRLECLLNNNKNSDFC :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: NP_005 CRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDYLRVWRVGETETRLECLLNNNKNSDFC 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE2 APLTSFDWNEVDPYLLGTSSIDTTCTIWGLETGQVLGRVNLVSGHVKTQLIAHDKEVYDI :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: NP_005 APLTSFDWNEVDPYLLGTSSIDTTCTIWGLETGQVLGRVNLVSGHVKTQLIAHDKEVYDI 130 140 150 160 170 180 190 200 210 220 230 240 pF1KE2 AFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTIIYEDPQHHPLLRLCWNKQDPNYLATM :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: NP_005 AFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTIIYEDPQHHPLLRLCWNKQDPNYLATM 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE2 AMDGMEVVILDVRVPCTPVARLNNHRACVNGIAWAPHSSCHICTAADDHQALIWDIQQMP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: NP_005 AMDGMEVVILDVRVPCTPVARLNNHRACVNGIAWAPHSSCHICTAADDHQALIWDIQQMP 250 260 270 280 290 300 310 320 330 340 pF1KE2 RAIEDPILAYTAEGEINNVQWASTQPDWIAICYNNCLEILRV :::::::::::::::::::::::::::::::::::::::::: NP_005 RAIEDPILAYTAEGEINNVQWASTQPDWIAICYNNCLEILRV 310 320 330 340 >>NP_113673 (OMIM: 610597) glutamate-rich WD repeat-cont (446 aa) initn: 213 init1: 101 opt: 196 Z-score: 221.2 bits: 49.7 E(85289): 1.4e-05 Smith-Waterman score: 205; 24.4% identity (52.6% similar) in 312 aa overlap (51-326:121-427) 30 40 50 60 70 80 pF1KE2 AMNWSVRPDKRFRLALGSFVEEYNNKVQLVGLDEESSEFICRNTFDHPYPTTKLMWIPDT : ::: : .. .. : .: .: NP_113 AGTQAESAQSNRLMMLRMHNLHGTKPPPSEGSDEEEEEEDEEDEEERK-PQLELAMVPHY 100 110 120 130 140 90 100 110 120 130 pF1KE2 KGVYPDLLATSGD--YLRVW-RVGETET-RLECLLNNNKNSDFCAPLTSFDWNEVDPYL- :. .. :. :: . :..:. :. ::. .. . : . . .. : . NP_113 GGINRVRVSWLGEEPVAGVWSEKGQVEVFALRRLLQVVEEPQALAAFLRDEQAQMKPIFS 150 160 170 180 190 200 140 150 160 170 180 pF1KE2 ----LGTS-SIDTTCTIWG-LETGQVLGRVNLV------SGHV-KTQLIAHDKEVYDIAF .: . ..: . . : : ::. ..: : :: . ...: . : :. . 
NP_113 FAGHMGEGFALDWSPRVTGRLLTGDCQKNIHLWTPTDGGSWHVDQRPFVGHTRSVEDLQW 210 220 230 240 250 260 190 200 210 220 230 240 pF1KE2 SRAGGGRDMFASVGADGSVRMFDLRHLEHSTIIYEDPQHHP--LLRLCWNKQDPNYLATM : . . .::: .::.:.:..:.: .. . : . . :....: .: . NP_113 SPTEN--TVFASCSADASIRIWDIRAAPSKACMLTTATAHDGDVNVISWSRREP-FLLSG 270 280 290 300 310 320 250 260 270 280 290 pF1KE2 AMDGMEVVILDVRV--PCTPVARLNNHRACVNGIAWAPHSSCHICTAADDHQALIWD--I . :: . : :.: .::: ...: : :... : :..: . ... ::: :: . NP_113 GDDGA-LKIWDLRQFKSGSPVATFKQHVAPVTSVEWHPQDSGVFAASGADHQITQWDLAV 330 340 350 360 370 380 300 310 320 330 340 pF1KE2 QQMPRA--IE-DPILA---------YTAEGEINNVQWASTQPDWIAICYNNCLEILRV .. :.: .: :: :: . .: :.....: : NP_113 ERDPEAGDVEADPGLADLPQQLLFVHQGETELKELHWHPQCPGLLVSTALSGFTIFRTIS 390 400 410 420 430 440 NP_113 V >>NP_001128728 (OMIM: 602923) histone-binding protein RB (390 aa) initn: 223 init1: 130 opt: 155 Z-score: 177.4 bits: 41.4 E(85289): 0.0038 Smith-Waterman score: 158; 28.4% identity (57.8% similar) in 102 aa overlap (232-321:100-200) 210 220 230 240 250 pF1KE2 RMFDLRHLEHSTIIYEDPQHHPLLRLCWNKQDPNYLATMAMDGMEVVILD-VRVP----- :.: .:: . .:...: .. : NP_001 FGGFGSVSGKIEIEIKINHEGEVNRARYMPQNPCIIAT-KTPSSDVLVFDYTKHPSKPDP 70 80 90 100 110 120 260 270 280 290 300 310 pF1KE2 ---CTPVARLNNHRACVNGIAWAPHSSCHICTAADDHQALIWDIQQMPRA--IEDPILAY :.: :: .:. :..: :. : :. .:.::: .:::. .:. . : . NP_001 SGECNPDLRLRGHQKEGYGLSWNPNLSGHLLSASDDHTICLWDISAVPKEGKVVDAKTIF 130 140 150 160 170 180 320 330 340 pF1KE2 TAEGEI-NNVQWASTQPDWIAICYNNCLEILRV :.. . ..:.: NP_001 TGHTAVVEDVSWHLLHESLFGSVADDQKLMIWDTRSNNTSKPSHSVDAHTAEVNCLSFNP 190 200 210 220 230 240 >>NP_001287673 (OMIM: 610257) protein transport protein (1067 aa) initn: 83 init1: 73 opt: 161 Z-score: 177.4 bits: 42.9 E(85289): 0.0038 Smith-Waterman score: 163; 21.5% identity (55.3% similar) in 275 aa overlap (76-332:14-279) 50 60 70 80 90 100 pF1KE2 KVQLVGLDEESSEFICRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDYLRVWRVGETET : : . .: :::. . .. . :.. 
NP_001 MKLKEVDRTAMQAWSPAQN--HPIYLATGTSAQQLDATFSTNA 10 20 30 40 110 120 130 140 150 pF1KE2 RLECL-LN-NNKNSDF--CAPLTS-FDWNEV--DPYLLGTSSIDTTCTIWGLETGQVL-- :: . :. .. . :. :: ..: .... :: . ... . : : :.:... NP_001 SLEIFELDLSDPSLDMKSCATFSSSHRYHKLIWGPYKMDSKGDVSGVLIAGGENGNIILY 50 60 70 80 90 100 160 170 180 190 200 210 pF1KE2 GRVNLVSGHVKTQLIAHDKEVYDI-AFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTII ....: .. . .::.. . :.. .. :: . .. . ..:: .. .: . NP_001 DPSKIIAGDKEVVIAQNDKHTGPVRALDVNIFQTNLVASGANESEIYIWDLNNF--ATPM 110 120 130 140 150 220 230 240 250 260 270 pF1KE2 YEDPQHHPLLRL-C--WNKQDPNYLATMAMDGMEVVILDVRVPCTPVARLNNH--RACVN . .: . : ::.: . ::. . .: .... :.: :. ....: : . NP_001 TPGAKTQPPEDISCIAWNRQVQHILASASPSG-RATVWDLR-KNEPIIKVSDHSNRMHCS 160 170 180 190 200 210 280 290 300 310 320 pF1KE2 GIAWAPHSSCHICTAA-DDHQALI--WDIQQMPRAIEDPILAYTAEGEINNVQWASTQPD :.:: : . .. :. ::. .: ::.. .. .: :.: : . :. ..:. NP_001 GLAWHPDVATQMVLASEDDRLPVIQMWDLRFASSPLR--VLENHARG-ILAIAWSMADPE 220 230 240 250 260 270 330 340 pF1KE2 WIAICYNNCLEILRV . : NP_001 LLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSAASFDGRISVYSIM 280 290 300 310 320 330 >>NP_001070674 (OMIM: 610257) protein transport protein (1106 aa) initn: 83 init1: 73 opt: 161 Z-score: 177.2 bits: 42.9 E(85289): 0.004 Smith-Waterman score: 163; 21.5% identity (55.3% similar) in 275 aa overlap (76-332:14-279) 50 60 70 80 90 100 pF1KE2 KVQLVGLDEESSEFICRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDYLRVWRVGETET : : . .: :::. . .. . :.. NP_001 MKLKEVDRTAMQAWSPAQN--HPIYLATGTSAQQLDATFSTNA 10 20 30 40 110 120 130 140 150 pF1KE2 RLECL-LN-NNKNSDF--CAPLTS-FDWNEV--DPYLLGTSSIDTTCTIWGLETGQVL-- :: . :. .. . :. :: ..: .... :: . ... . : : :.:... NP_001 SLEIFELDLSDPSLDMKSCATFSSSHRYHKLIWGPYKMDSKGDVSGVLIAGGENGNIILY 50 60 70 80 90 100 160 170 180 190 200 210 pF1KE2 GRVNLVSGHVKTQLIAHDKEVYDI-AFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTII ....: .. . .::.. . :.. .. :: . .. . ..:: .. .: . 
NP_001 DPSKIIAGDKEVVIAQNDKHTGPVRALDVNIFQTNLVASGANESEIYIWDLNNF--ATPM 110 120 130 140 150 220 230 240 250 260 270 pF1KE2 YEDPQHHPLLRL-C--WNKQDPNYLATMAMDGMEVVILDVRVPCTPVARLNNH--RACVN . .: . : ::.: . ::. . .: .... :.: :. ....: : . NP_001 TPGAKTQPPEDISCIAWNRQVQHILASASPSG-RATVWDLR-KNEPIIKVSDHSNRMHCS 160 170 180 190 200 210 280 290 300 310 320 pF1KE2 GIAWAPHSSCHICTAA-DDHQALI--WDIQQMPRAIEDPILAYTAEGEINNVQWASTQPD :.:: : . .. :. ::. .: ::.. .. .: :.: : . :. ..:. NP_001 GLAWHPDVATQMVLASEDDRLPVIQMWDLRFASSPLR--VLENHARG-ILAIAWSMADPE 220 230 240 250 260 270 330 340 pF1KE2 WIAICYNNCLEILRV . : NP_001 LLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSAASFDGRISVYSIM 280 290 300 310 320 330 >>NP_001128727 (OMIM: 602923) histone-binding protein RB (424 aa) initn: 223 init1: 130 opt: 155 Z-score: 176.9 bits: 41.5 E(85289): 0.0041 Smith-Waterman score: 158; 28.4% identity (57.8% similar) in 102 aa overlap (232-321:134-234) 210 220 230 240 250 pF1KE2 RMFDLRHLEHSTIIYEDPQHHPLLRLCWNKQDPNYLATMAMDGMEVVILD-VRVP----- :.: .:: . .:...: .. : NP_001 FGGFGSVSGKIEIEIKINHEGEVNRARYMPQNPCIIAT-KTPSSDVLVFDYTKHPSKPDP 110 120 130 140 150 160 260 270 280 290 300 310 pF1KE2 ---CTPVARLNNHRACVNGIAWAPHSSCHICTAADDHQALIWDIQQMPRA--IEDPILAY :.: :: .:. :..: :. : :. .:.::: .:::. .:. . : . NP_001 SGECNPDLRLRGHQKEGYGLSWNPNLSGHLLSASDDHTICLWDISAVPKEGKVVDAKTIF 170 180 190 200 210 220 320 330 340 pF1KE2 TAEGEI-NNVQWASTQPDWIAICYNNCLEILRV :.. . ..:.: NP_001 TGHTAVVEDVSWHLLHESLFGSVADDQKLMIWDTRSNNTSKPSHSVDAHTAEVNCLSFNP 230 240 250 260 270 280 >>NP_005601 (OMIM: 602923) histone-binding protein RBBP4 (425 aa) initn: 223 init1: 130 opt: 155 Z-score: 176.9 bits: 41.5 E(85289): 0.0041 Smith-Waterman score: 158; 28.4% identity (57.8% similar) in 102 aa overlap (232-321:135-235) 210 220 230 240 250 pF1KE2 RMFDLRHLEHSTIIYEDPQHHPLLRLCWNKQDPNYLATMAMDGMEVVILD-VRVP----- :.: .:: . .:...: .. 
: NP_005 FGGFGSVSGKIEIEIKINHEGEVNRARYMPQNPCIIAT-KTPSSDVLVFDYTKHPSKPDP 110 120 130 140 150 160 260 270 280 290 300 310 pF1KE2 ---CTPVARLNNHRACVNGIAWAPHSSCHICTAADDHQALIWDIQQMPRA--IEDPILAY :.: :: .:. :..: :. : :. .:.::: .:::. .:. . : . NP_005 SGECNPDLRLRGHQKEGYGLSWNPNLSGHLLSASDDHTICLWDISAVPKEGKVVDAKTIF 170 180 190 200 210 220 320 330 340 pF1KE2 TAEGEI-NNVQWASTQPDWIAICYNNCLEILRV :.. . ..:.: NP_005 TGHTAVVEDVSWHLLHESLFGSVADDQKLMIWDTRSNNTSKPSHSVDAHTAEVNCLSFNP 230 240 250 260 270 280 >>NP_001287674 (OMIM: 610257) protein transport protein (1166 aa) initn: 83 init1: 73 opt: 161 Z-score: 176.8 bits: 42.9 E(85289): 0.0041 Smith-Waterman score: 163; 21.5% identity (55.3% similar) in 275 aa overlap (76-332:14-279) 50 60 70 80 90 100 pF1KE2 KVQLVGLDEESSEFICRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDYLRVWRVGETET : : . .: :::. . .. . :.. NP_001 MKLKEVDRTAMQAWSPAQN--HPIYLATGTSAQQLDATFSTNA 10 20 30 40 110 120 130 140 150 pF1KE2 RLECL-LN-NNKNSDF--CAPLTS-FDWNEV--DPYLLGTSSIDTTCTIWGLETGQVL-- :: . :. .. . :. :: ..: .... :: . ... . : : :.:... NP_001 SLEIFELDLSDPSLDMKSCATFSSSHRYHKLIWGPYKMDSKGDVSGVLIAGGENGNIILY 50 60 70 80 90 100 160 170 180 190 200 210 pF1KE2 GRVNLVSGHVKTQLIAHDKEVYDI-AFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTII ....: .. . .::.. . :.. .. :: . .. . ..:: .. .: . NP_001 DPSKIIAGDKEVVIAQNDKHTGPVRALDVNIFQTNLVASGANESEIYIWDLNNF--ATPM 110 120 130 140 150 220 230 240 250 260 270 pF1KE2 YEDPQHHPLLRL-C--WNKQDPNYLATMAMDGMEVVILDVRVPCTPVARLNNH--RACVN . .: . : ::.: . ::. . .: .... :.: :. ....: : . NP_001 TPGAKTQPPEDISCIAWNRQVQHILASASPSG-RATVWDLR-KNEPIIKVSDHSNRMHCS 160 170 180 190 200 210 280 290 300 310 320 pF1KE2 GIAWAPHSSCHICTAA-DDHQALI--WDIQQMPRAIEDPILAYTAEGEINNVQWASTQPD :.:: : . .. :. ::. .: ::.. .. .: :.: : . :. ..:. NP_001 GLAWHPDVATQMVLASEDDRLPVIQMWDLRFASSPLR--VLENHARG-ILAIAWSMADPE 220 230 240 250 260 270 330 340 pF1KE2 WIAICYNNCLEILRV . 
: NP_001 LLLSCGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSAASFDGRISVYSIM 280 290 300 310 320 330 >>NP_001177978 (OMIM: 610257) protein transport protein (1200 aa) initn: 83 init1: 73 opt: 161 Z-score: 176.6 bits: 42.9 E(85289): 0.0042 Smith-Waterman score: 161; 20.4% identity (55.0% similar) in 211 aa overlap (133-332:71-274) 110 120 130 140 150 160 pF1KE2 TETRLECLLNNNKNSDFCAPLTSFDWNEVDPYLLGTSSIDTTCTIWGLETGQVL--GRVN :: . ... . : : :.:... . NP_001 FELDLSDPSLDMKSCATFSSSHRYHKLIWGPYKMDSKGDVSGVLIAGGENGNIILYDPSK 50 60 70 80 90 100 170 180 190 200 210 pF1KE2 LVSGHVKTQLIAHDKEVYDI-AFSRAGGGRDMFASVGADGSVRMFDLRHLEHSTIIYEDP ...: .. . .::.. . :.. .. :: . .. . ..:: .. .: . NP_001 IIAGDKEVVIAQNDKHTGPVRALDVNIFQTNLVASGANESEIYIWDLNNF--ATPMTPGA 110 120 130 140 150 220 230 240 250 260 270 pF1KE2 QHHPLLRL-C--WNKQDPNYLATMAMDGMEVVILDVRVPCTPVARLNNH--RACVNGIAW . .: . : ::.: . ::. . .: .... :.: :. ....: : .:.:: NP_001 KTQPPEDISCIAWNRQVQHILASASPSG-RATVWDLR-KNEPIIKVSDHSNRMHCSGLAW 160 170 180 190 200 210 280 290 300 310 320 330 pF1KE2 APHSSCHICTAADDHQ---ALIWDIQQMPRAIEDPILAYTAEGEINNVQWASTQPDWIAI : . .. :..: . .::.. .. .: :.: : . :. ..:. . NP_001 HPDVATQMVLASEDDRLPVIQMWDLRFASSPLR--VLENHARG-ILAIAWSMADPELLLS 220 230 240 250 260 270 340 pF1KE2 CYNNCLEILRV : NP_001 CGKDAKILCSNPNTGEVLYELPTNTQWCFDIQWCPRNPAVLSAASFDGRISVYSIMGGST 280 290 300 310 320 330 >>NP_056305 (OMIM: 610258) protein transport protein Sec (1179 aa) initn: 66 init1: 66 opt: 156 Z-score: 171.3 bits: 41.9 E(85289): 0.0084 Smith-Waterman score: 157; 22.9% identity (50.9% similar) in 275 aa overlap (76-327:14-272) 50 60 70 80 90 100 pF1KE2 KVQLVGLDEESSEFICRNTFDHPYPTTKLMWIPDTKGVYPDLLATSGDYLRVWRVGETET : : .. :: :::. . .. :. NP_056 MKLKELERPAVQAWSPASQ--YPLYLATGTSAQQLDSSFSTNG 10 20 30 40 110 120 130 140 150 pF1KE2 RLECLLNNNKNSDF-------CAPLTSFD---WNEVDPYLLGTSSIDTTCTIWGLETGQ- :: . . .. .. . :. : :. :: .:.. . : ..:. 
NP_056 TLEIFEVDFRDPSLDLKHRGVLSALSRFHKLVWGSFGSGLLESSGV----IVGGGDNGML 50 60 70 80 90 160 170 180 190 200 210 pF1KE2 VLGRVNLVSGHVKTQLIA----HDKEVYDIAFSRAGGGRDMFASVGADGSVRMFDLRHLE .: :. . . : .:: : : . .. : ...:: ..:. . ..:: .:. NP_056 ILYNVTHILSSGKEPVIAQKQKHTGAVRALDLNPFQG--NLLASGASDSEIFIWDLNNLN 100 110 120 130 140 150 220 230 240 250 260 pF1KE2 HSTIIYEDPQHHP--LLRLCWNKQDPNYLATMAMDGMEVVILDVRVPCTPVARLNNH--R . :. : . : ::.: . :.. .: ..:. :.: :. ....: : NP_056 VPMTLGSKSQQPPEDIKALSWNRQAQHILSSAHPSG-KAVVWDLR-KNEPIIKVSDHSNR 160 170 180 190 200 210 270 280 290 300 310 320 pF1KE2 ACVNGIAWAPHSSCHI--CTAADDHQALI--WDIQQMPRAIEDPILAYTAEGEINNVQWA .:.:: : . .. :. ::. .: ::.. .. .: ..: : .:.: NP_056 MHCSGLAWHPDIATQLVLCSE-DDRLPVIQLWDLRFASSPLK--VLESHSRG-ILSVSW- 220 230 240 250 260 330 340 pF1KE2 STQPDWIAICYNNCLEILRV .: : NP_056 -SQADAELLLTSAKDSQILCRNLGSSEVVYKLPTQSSWCFDVQWCPRDPSVFSAASFNGW 270 280 290 300 310 320 342 residues in 1 query sequences 60827320 residues in 85289 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Mon Nov 7 02:52:17 2016 done: Mon Nov 7 02:52:18 2016 Total Scan time: 6.970 Total Display time: 0.020 Function used was FASTA [36.3.4 Apr, 2011]