Miyakogusa Predicted Gene
- Lj5g3v2264220.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj5g3v2264220.1 Non Characterized Hit- tr|I1IJ83|I1IJ83_BRADI
Uncharacterized protein OS=Brachypodium distachyon
GN=,40,1e-16,UNCHARACTERIZED,Harbinger transposase-derived nuclease;
DDE_4,NULL,CUFF.57130.1
(382 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                   Score    E
Sequences producing significant alignments:                        (bits) Value
AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant transp...   442   e-124
AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h...   340   1e-93
AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response...   135   5e-32
AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis thal...   117   1e-26
AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED I...   107   1e-23
AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 6...   106   3e-23
AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant transp...    87   2e-17
AT5G41980.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h...    63   4e-10
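
The summary lines above can be read programmatically. The following is a minimal sketch, assuming each hit line ends with two whitespace-separated fields (bit score, then E-value) and that a bare "e-124" is shorthand for 1e-124. The function name parse_hit_line is hypothetical and is not part of BLAST.

```python
def parse_hit_line(line):
    """Split one summary line into (subject_id, bit_score, e_value)."""
    # The last two whitespace-separated fields are the score and the E-value;
    # everything before them is the (possibly truncated) description.
    head, score, evalue = line.rsplit(None, 2)
    # The subject identifier is the text before the first "|".
    subject_id = head.split("|", 1)[0].strip()
    # Assumption: "e-124" abbreviates "1e-124", so restore the mantissa.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    return subject_id, float(score), float(evalue)


hits = [
    "AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant transp... 442 e-124",
    "AT5G41980.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h... 63 4e-10",
]
parsed = [parse_hit_line(h) for h in hits]
for subject_id, bits, evalue in parsed:
    print(subject_id, bits, evalue)
```

Splitting from the right keeps the parser independent of column alignment, so it works whether or not the table spacing has been preserved.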
>AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:20518518-20520690 FORWARD LENGTH=406
Length = 406
Score = 442 bits (1137), Expect = e-124, Method: Compositional matrix adjust.
Identities = 214/358 (59%), Positives = 265/358 (74%), Gaps = 8/358 (2%)
Query: 27 QPSDWWHHFSHRISGPLAQSKDIGKFESVLKISRKTFNYICSLVEKDMLARSC--VDLNG 84
Q DWW FS RI G S D FESV KISRKTF+YICSLV+ D A+ D NG
Sbjct: 50 QSLDWWDGFSRRIYG---GSTDPKTFESVFKISRKTFDYICSLVKADFTAKPANFSDSNG 106
Query: 85 NHLSLNDQVAVALRRLSSGESLSTIGDSFLMNQSAVSQVTWLFVEAMEERGLHHLSWPST 144
N LSLND+VAVALRRL SGESLS IG++F MNQS VSQ+TW FVE+MEER +HHLSWPS
Sbjct: 107 NPLSLNDRVAVALRRLGSGESLSVIGETFGMNQSTVSQITWRFVESMEERAIHHLSWPS- 165
Query: 145 ETAMEEIKFKFENIRGLSNCCGAVDSTHILMTLPSGDTENSVWLDRKKNCSMILQAIVDP 204
++EIK KFE I GL NCCGA+D THI+M LP+ + N VWLD +KN SM LQA+VDP
Sbjct: 166 --KLDEIKSKFEKISGLPNCCGAIDITHIVMNLPAVEPSNKVWLDGEKNFSMTLQAVVDP 223
Query: 205 DLRFRDVVGGWPGSLSDEYVLRHSEFFKLAEEGKRLNGAEKMLPEGTALREYIIGDTGFP 264
D+RF DV+ GWPGSL+D+ VL++S F+KL E+GKRLNG + L E T LREYI+GD+GFP
Sbjct: 224 DMRFLDVIAGWPGSLNDDVVLKNSGFYKLVEKGKRLNGEKLPLSERTELREYIVGDSGFP 283
Query: 265 LLPWLLTPYECKDLSDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGVMWKPDKHKLPR 324
LLPWLLTPY+ K S + EFNKR A+ AL++LK W+II GVMW PD+++LPR
Sbjct: 284 LLPWLLTPYQGKPTSLPQTEFNKRHSEATKAAQMALSKLKDRWRIINGVMWMPDRNRLPR 343
Query: 325 IVLVCCILHNIVIDMEDEVMDEVPLCPQHDSGYQDQTCEFADNTAYTMREKLSLHLSG 382
I+ VCC+LHNI+IDMED+ +D+ PL QHD Y+ ++C+ AD + +R++LS L G
Sbjct: 344 IIFVCCLLHNIIIDMEDQTLDDQPLSQQHDMNYRQRSCKLADEASSVLRDELSDQLCG 401
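
The percentages on each "Identities" line can be recomputed from the fractions shown. The following is a minimal sketch, assuming the report truncates percentages to whole numbers (consistent with 214/358 being shown as 59%). The names FRACTION and parse_fractions are hypothetical helpers, not BLAST output.

```python
import re

# Matches fields such as "Identities = 214/358 (59%)".
FRACTION = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_fractions(line):
    """Map each field name to (count, alignment_length, shown_pct, recomputed_pct)."""
    result = {}
    for name, num, den, shown in FRACTION.findall(line):
        num, den = int(num), int(den)
        # Integer division truncates, matching the assumed rounding of the report.
        result[name] = (num, den, int(shown), 100 * num // den)
    return result


stats = parse_fractions(
    "Identities = 214/358 (59%), Positives = 265/358 (74%), Gaps = 8/358 (2%)"
)
for name, (num, den, shown, recomputed) in stats.items():
    print(name, num, den, shown, recomputed)
```

Note that the denominator is the alignment length including gaps (358 here), not the query length of 382 letters, so these values describe the aligned region only.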
>AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative
harbinger transposase-derived nuclease
(InterPro:IPR006912); BEST Arabidopsis thaliana protein
match is: PIF / Ping-Pong family of plant transposases
(TAIR:AT3G55350.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr3:23375932-23377398 REVERSE LENGTH=396
Length = 396
Score = 340 bits (871), Expect = 1e-93, Method: Compositional matrix adjust.
Identities = 164/354 (46%), Positives = 233/354 (65%), Gaps = 5/354 (1%)
Query: 30 DWWHHFSHRISGPLAQSKDIGKFESVLKISRKTFNYICSLVEKDMLAR---SCVDLNGNH 86
DWW F R S P S + F+ + S+ TF+YICSLV +D+++R +++ G
Sbjct: 43 DWWDTFWLRNSSPSVPSDEDYAFKHFFRASKTTFSYICSLVREDLISRPPSGLINIEGRL 102
Query: 87 LSLNDQVAVALRRLSSGESLSTIGDSFLMNQSAVSQVTWLFVEAMEERGLHHLSWPSTET 146
LS+ QVA+ALRRL+SG+S ++G +F + QS VSQVTW F+EA+EER HHL WP ++
Sbjct: 103 LSVEKQVAIALRRLASGDSQVSVGAAFGVGQSTVSQVTWRFIEALEERAKHHLRWPDSDR 162
Query: 147 AMEEIKFKFENIRGLSNCCGAVDSTHILMTLPSGDTENSVWLDRKKNCSMILQAIVDPDL 206
+EEIK KFE + GL NCCGA+D+THI+MTLP+ + W D++KN SM LQ + D ++
Sbjct: 163 -IEEIKSKFEEMYGLPNCCGAIDTTHIIMTLPAVQASDD-WCDQEKNYSMFLQGVFDHEM 220
Query: 207 RFRDVVGGWPGSLSDEYVLRHSEFFKLAEEGKRLNGAEKMLPEGTALREYIIGDTGFPLL 266
RF ++V GWPG ++ +L+ S FFKL E + L+G K L +G +REY++G +PLL
Sbjct: 221 RFLNMVTGWPGGMTVSKLLKFSGFFKLCENAQILDGNPKTLSQGAQIREYVVGGISYPLL 280
Query: 267 PWLLTPYECKDLSDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGVMWKPDKHKLPRIV 326
PWL+TP++ SD V FN+R VA A +LK W+I+ VMW+PD+ KLP I+
Sbjct: 281 PWLITPHDSDHPSDSMVAFNERHEKVRSVAATAFQQLKGSWRILSKVMWRPDRRKLPSII 340
Query: 327 LVCCILHNIVIDMEDEVMDEVPLCPQHDSGYQDQTCEFADNTAYTMREKLSLHL 380
LVCC+LHNI+ID D + ++VPL HDSGY D+ C+ + +R L+ HL
Sbjct: 341 LVCCLLHNIIIDCGDYLQEDVPLSGHHDSGYADRYCKQTEPLGSELRGCLTEHL 394
>AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response to
salt stress; LOCATED IN: chloroplast, plasma membrane,
membrane; EXPRESSED IN: 23 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT4G29780.1);
Has 1807 Blast hits to 1807 proteins in 277 species:
Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347;
Plants - 385; Viruses - 0; Other Eukaryotes - 339
(source: NCBI BLink). | chr5:3877975-3879483 REVERSE
LENGTH=502
Length = 502
Score = 135 bits (340), Expect = 5e-32, Method: Compositional matrix adjust.
Identities = 96/327 (29%), Positives = 160/327 (48%), Gaps = 33/327 (10%)
Query: 31 WWHHFSHRISGPLAQSKDIGKFESVLKISRKTFNYICSLVEKDMLARSCVDLNGNHLSLN 90
WW S R+ P F+ ++S+ TF IC + +A+ L N + +
Sbjct: 161 WWEECS-RLDYPEED------FKKAFRMSKSTFELICDEL-NSAVAKEDTALR-NAIPVR 211
Query: 91 DQVAVALRRLSSGESLSTIGDSFLMNQSAVSQVTWLFVEAMEERGL-HHLSWPSTETAME 149
+VAV + RL++GE L + F + S ++ +A+++ + +L WP E+ +
Sbjct: 212 QRVAVCIWRLATGEPLRLVSKKFGLGISTCHKLVLEVCKAIKDVLMPKYLQWPDDES-LR 270
Query: 150 EIKFKFENIRGLSNCCGAVDSTHILMTLP-----SGDTENSVWLDRKKNCSMILQAIVDP 204
I+ +FE++ G+ N G++ +THI + P S + ++K + S+ +QA+V+P
Sbjct: 271 NIRERFESVSGIPNVVGSMYTTHIPIIAPKISVASYFNKRHTERNQKTSYSITIQAVVNP 330
Query: 205 DLRFRDVVGGWPGSLSDEYVLRHSEFFKLAEEGKRLNGAEKMLPEGTALREYIIGDTGFP 264
F D+ GWPGS+ D+ VL S ++ A G L G ++ G G P
Sbjct: 331 KGVFTDLCIGWPGSMPDDKVLEKSLLYQRANNGGLLKGM------------WVAGGPGHP 378
Query: 265 LLPWLLTPYECKDLSDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGVMWKPDKHKLPR 324
LL W+L PY ++L+ + FN+++ VAK A RLK W +Q + LP
Sbjct: 379 LLDWVLVPYTQQNLTWTQHAFNEKMSEVQGVAKEAFGRLKGRWACLQKRT-EVKLQDLPT 437
Query: 325 IVLVCCILHNIV----IDMEDEVMDEV 347
++ CC+LHNI ME E+M EV
Sbjct: 438 VLGACCVLHNICEMREEKMEPELMVEV 464
>AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT5G12010.1); Has 945 Blast hits to 944 proteins
in 87 species: Archae - 0; Bacteria - 0; Metazoa - 519;
Fungi - 43; Plants - 365; Viruses - 0; Other Eukaryotes
- 18 (source: NCBI BLink). | chr4:14579859-14581481
FORWARD LENGTH=540
Length = 540
Score = 117 bits (293), Expect = 1e-26, Method: Compositional matrix adjust.
Identities = 88/330 (26%), Positives = 157/330 (47%), Gaps = 39/330 (11%)
Query: 29 SDWWHHFSHRISGPLAQSKDIGKFESVLKISRKTFNYIC-----SLVEKDMLARSCVDLN 83
+DWW R+S P + F ++S+ TFN IC ++ +K+ + R +
Sbjct: 197 TDWWD----RVSRPDFPEDE---FRREFRMSKSTFNLICEELDTTVTKKNTMLRDAI--- 246
Query: 84 GNHLSLNDQVAVALRRLSSGESLSTIGDSFLMNQSAVSQVTWLFVEAMEERGL-HHLSWP 142
+V V + RL++G L + + F + S ++ A+ + + +L WP
Sbjct: 247 ----PAPKRVGVCVWRLATGAPLRHVSERFGLGISTCHKLVIEVCRAIYDVLMPKYLLWP 302
Query: 143 STETAMEEIKFKFENIRGLSNCCGAVDSTHILMTLPSGD-----TENSVWLDRKKNCSMI 197
S ++ + K KFE++ + N G++ +THI + P + ++K + S+
Sbjct: 303 S-DSEINSTKAKFESVHKIPNVVGSIYTTHIPIIAPKVHVAAYFNKRHTERNQKTSYSIT 361
Query: 198 LQAIVDPDLRFRDVVGGWPGSLSDEYVLRHSEFFKLAEEGKRLNGAEKMLPEGTALREYI 257
+Q +V+ D F DV G PGSL+D+ +L S R A ML + +I
Sbjct: 362 VQGVVNADGIFTDVCIGNPGSLTDDQILEKSSL-------SRQRAARGMLRDS-----WI 409
Query: 258 IGDTGFPLLPWLLTPYECKDLSDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGVMWKP 317
+G++GFPL +LL PY ++L+ + FN+ + +A A RLK W +Q +
Sbjct: 410 VGNSGFPLTDYLLVPYTRQNLTWTQHAFNESIGEIQGIATAAFERLKGRWACLQKRT-EV 468
Query: 318 DKHKLPRIVLVCCILHNIVIDMEDEVMDEV 347
LP ++ CC+LHNI ++E++ E+
Sbjct: 469 KLQDLPYVLGACCVLHNICEMRKEEMLPEL 498
>AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED IN:
shoot apex, embryo, flower, seed; EXPRESSED DURING:
petal differentiation and expansion stage, E expanded
cotyledon stage, D bilateral stage; BEST Arabidopsis
thaliana protein match is: PIF / Ping-Pong family of
plant transposases (TAIR:AT3G55350.1). |
chr1:27209890-27211122 REVERSE LENGTH=410
Length = 410
Score = 107 bits (268), Expect = 1e-23, Method: Compositional matrix adjust.
Identities = 59/167 (35%), Positives = 94/167 (56%), Gaps = 8/167 (4%)
Query: 195 SMILQAIVDPDLRFRDVVGGWPGSLSDEYVLRHSEFFKLAEEGKRLNGAEKMLPEGTALR 254
S+++QA+VD + RF D+ GWP ++ E + R ++ F +AEE L+GA L G +
Sbjct: 206 SILVQALVDSNGRFVDISAGWPSTMKPEAIFRQTKLFSIAEE--VLSGAPTKLGNGVLVP 263
Query: 255 EYIIGDTGFPLLPWLLTPYE-CKDLSDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGV 313
YI+GD+ PLLPWL+TPY+ D EFN V + A A+++ W+I+
Sbjct: 264 RYILGDSCLPLLPWLVTPYDLTSDEESFREEFNNVVHTGLHSVEIAFAKVRARWRILDK- 322
Query: 314 MWKPDKHK-LPRIVLVCCILHNIVI---DMEDEVMDEVPLCPQHDSG 356
WKP+ + +P ++ C+LHN ++ D +D V + V C D+G
Sbjct: 323 KWKPETIEFMPFVITTGCLLHNFLVNSGDDDDSVEECVNGCEAGDNG 369
>AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 60S
biogenesis N-terminal (InterPro:IPR021714); BEST
Arabidopsis thaliana protein match is: unknown protein
(TAIR:AT4G27010.1); Has 772 Blast hits to 657 proteins
in 120 species: Archae - 0; Bacteria - 0; Metazoa - 344;
Fungi - 94; Plants - 322; Viruses - 0; Other Eukaryotes
- 12 (source: NCBI BLink). | chr1:27199733-27211122
REVERSE LENGTH=2845
Length = 2845
Score = 106 bits (264), Expect = 3e-23, Method: Compositional matrix adjust.
Identities = 59/167 (35%), Positives = 94/167 (56%), Gaps = 8/167 (4%)
Query: 195 SMILQAIVDPDLRFRDVVGGWPGSLSDEYVLRHSEFFKLAEEGKRLNGAEKMLPEGTALR 254
S+++QA+VD + RF D+ GWP ++ E + R ++ F +AEE L+GA L G +
Sbjct: 206 SILVQALVDSNGRFVDISAGWPSTMKPEAIFRQTKLFSIAEE--VLSGAPTKLGNGVLVP 263
Query: 255 EYIIGDTGFPLLPWLLTPYE-CKDLSDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGV 313
YI+GD+ PLLPWL+TPY+ D EFN V + A A+++ W+I+
Sbjct: 264 RYILGDSCLPLLPWLVTPYDLTSDEESFREEFNNVVHTGLHSVEIAFAKVRARWRILDK- 322
Query: 314 MWKPDKHK-LPRIVLVCCILHNIVI---DMEDEVMDEVPLCPQHDSG 356
WKP+ + +P ++ C+LHN ++ D +D V + V C D+G
Sbjct: 323 KWKPETIEFMPFVITTGCLLHNFLVNSGDDDDSVEECVNGCEAGDNG 369
>AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:6609678-6611018 REVERSE LENGTH=446
Length = 446
Score = 87.4 bits (215), Expect = 2e-17, Method: Compositional matrix adjust.
Identities = 69/259 (26%), Positives = 122/259 (47%), Gaps = 9/259 (3%)
Query: 82 LNGNHLSLNDQVAVA--LRRLSSGESLSTIGDSFLMNQSAVSQVTWLFVEAMEERGL-HH 138
+ ++LSL AVA L RL+ G S T+ + ++ +S++T + + +
Sbjct: 139 ITASNLSLPADYAVAMVLSRLAHGCSAKTLASRYSLDPYLISKITNMVTRLLATKLYPEF 198
Query: 139 LSWPSTETAMEEIKFKFENIRGLSNCCGAVDSTHILMTLPSGDTENSVWLDRKKNCSMIL 198
+ P + + E FE + L N CGA+DST + + + +++ + +++L
Sbjct: 199 IKIPVGKRRLIETTQGFEELTSLPNICGAIDSTPVKLRRRTKLNPRNIYGCKYGYDAVLL 258
Query: 199 QAIVDPDLRFRDVVGGWPGSLSDEYVLRHSEFFKLAEEGKRLNGAEKMLP-EGTALREYI 257
Q + D F DV PG D R S +K G + EK++ G +R YI
Sbjct: 259 QVVADHKKIFWDVCVKAPGGEDDSSHFRDSLLYKRLTSGDIV--WEKVINIRGHHVRPYI 316
Query: 258 IGDTGFPLLPWLLTPYECKDL-SDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGVMWK 316
+GD +PLL +L+TP+ + E F+ ++ V A+ LK WKI+Q +
Sbjct: 317 VGDWCYPLLSFLMTPFSPNGSGTPPENLFDGMLMKGRSVVVEAIGLLKARWKILQSL--N 374
Query: 317 PDKHKLPRIVLVCCILHNI 335
+ P+ ++ CC+LHN+
Sbjct: 375 VGVNHAPQTIVACCVLHNL 393
>AT5G41980.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative
harbinger transposase-derived nuclease
(InterPro:IPR006912); BEST Arabidopsis thaliana protein
match is: unknown protein (TAIR:AT1G43722.1); Has 1807
Blast hits to 1807 proteins in 277 species: Archae - 0;
Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
Viruses - 0; Other Eukaryotes - 339 (source: NCBI
BLink). | chr5:16793765-16794889 FORWARD LENGTH=374
Length = 374
Score = 62.8 bits (151), Expect = 4e-10, Method: Compositional matrix adjust.
Identities = 50/179 (27%), Positives = 77/179 (43%), Gaps = 20/179 (11%)
Query: 161 LSNCCGAVDSTHILMTLPSGDTENSVWLDRKKNCSMILQAIVDPDLRFRDVVGGWPGSLS 220
+C G VDS HI + + G E + + + + A DLRF V+ GW GS S
Sbjct: 140 FKDCVGVVDSFHIPVMV--GVDEQGPFRNGNGLLTQNVLAASSFDLRFNYVLAGWEGSAS 197
Query: 221 DEYVLRHSEFFKLAEEGKRLNGAEKMLPEGTALREYIIGDTGFPLLPWLLTPYE---CKD 277
D+ VL + L K +P+G +Y I D +P LP + PY
Sbjct: 198 DQQVLNAA----LTRRNKL------QVPQG----KYYIVDNKYPNLPGFIAPYHGVSTNS 243
Query: 278 LSDVEVEFNKRVVATHMVAKRALARLKQMWKIIQGVMWKPDKHKLPRIVLVCCILHNIV 336
+ + FN+R H R LK+ + I+ P + ++ ++V+ C LHN V
Sbjct: 244 REEAKEMFNERHKLLHRAIHRTFGALKERFPILLSAPPYPLQTQV-KLVIAACALHNYV 301