Miyakogusa Predicted Gene

Lj1g3v4446900.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj1g3v4446900.1 Non Characterized Hit- tr|I1IJ83|I1IJ83_BRADI
Uncharacterized protein OS=Brachypodium distachyon
GN=,29.94,0.00000000000004,DDE_4,NULL; coiled-coil,NULL; seg,NULL;
UNCHARACTERIZED,Harbinger transposase-derived
nuclease,NODE_70855_length_1398_cov_16.856939.path1.1
         (394 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT3G63270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Putative h...   535   e-152
AT3G55350.1 | Symbols:  | PIF / Ping-Pong family of plant transp...   343   2e-94
AT5G12010.1 | Symbols:  | unknown protein; INVOLVED IN: response...   155   3e-38
AT4G29780.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...   129   3e-30
AT3G19120.1 | Symbols:  | PIF / Ping-Pong family of plant transp...   103   2e-22
AT1G72270.2 | Symbols:  | LOCATED IN: mitochondrion; EXPRESSED I...    89   4e-18
AT1G72270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Ribosome 6...    88   1e-17
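
Note on reading the summary table above: each row ends with the bit score and the E-value, and this legacy BLAST format prints very small E-values without a leading digit (for example "e-152" for 1e-152). The following is a minimal Python sketch for turning one row into numbers. The helper name parse_summary_line is hypothetical and not part of BLAST, and the sketch assumes the last two whitespace-separated fields are always the score and the E-value.

```python
def parse_summary_line(line):
    """Split one row of the 'Sequences producing significant alignments' table.

    Hypothetical helper for illustration only; assumes the row ends with
    '<bit score> <E-value>' and starts with the hit identifier.
    """
    # Peel the two right-most fields off the row; the rest is ID + description.
    rest, bits, evalue = line.rsplit(None, 2)
    hit_id = rest.split()[0]
    # Legacy shorthand: "e-152" means 1e-152.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    return hit_id, float(bits), float(evalue)


row = "AT3G55350.1 | Symbols:  | PIF / Ping-Pong family of plant transp...   343   2e-94"
print(parse_summary_line(row))
```

Splitting from the right avoids having to interpret the truncated description text, which may itself contain numbers.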

>AT3G63270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Putative
           harbinger transposase-derived nuclease
           (InterPro:IPR006912); BEST Arabidopsis thaliana protein
           match is: PIF / Ping-Pong family of plant transposases
           (TAIR:AT3G55350.1); Has 30201 Blast hits to 17322
           proteins in 780 species: Archae - 12; Bacteria - 1396;
           Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
           0; Other Eukaryotes - 2996 (source: NCBI BLink). |
           chr3:23375932-23377398 REVERSE LENGTH=396
          Length = 396

 Score =  535 bits (1377), Expect = e-152,   Method: Compositional matrix adjust.
 Identities = 253/368 (68%), Positives = 296/368 (80%), Gaps = 4/368 (1%)

Query: 24  VNPVSVEPRTSETDWWESFWHKNSTAPGYSVSGDEEEGFKYFFRVSKTTFEYICSLVRQD 83
           VN V ++P   + DWW++FW +NS+    SV  DE+  FK+FFR SKTTF YICSLVR+D
Sbjct: 30  VNAVPLDPEAIDCDWWDTFWLRNSSP---SVPSDEDYAFKHFFRASKTTFSYICSLVRED 86

Query: 84  LISRPPSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEA 143
           LISRPPSGLINIEGRLLSVEKQVAIALRRLASG+SQVSVGA+FGVGQSTVSQVTWRFIEA
Sbjct: 87  LISRPPSGLINIEGRLLSVEKQVAIALRRLASGDSQVSVGAAFGVGQSTVSQVTWRFIEA 146

Query: 144 LEERATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETSYDWCDQEK 203
           LEERA HHL WPD +R++EIK  FE  YGLPNCCGA+D THI+MTLPAV+ S DWCDQEK
Sbjct: 147 LEERAKHHLRWPDSDRIEEIKSKFEEMYGLPNCCGAIDTTHIIMTLPAVQASDDWCDQEK 206

Query: 204 NYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLG-GDV 262
           NYSM  QG+ DHEMRF++++TG PGGMT S+ LK SGF++L +N + L+GN +TL  G  
Sbjct: 207 NYSMFLQGVFDHEMRFLNMVTGWPGGMTVSKLLKFSGFFKLCENAQILDGNPKTLSQGAQ 266

Query: 263 IREYVVGGYSYPLLPWLMTPYETNGISDSQSTFNYKHGAARLLAVRAFSLLKGSWRILSK 322
           IREYVVGG SYPLLPWL+TP++++  SDS   FN +H   R +A  AF  LKGSWRILSK
Sbjct: 267 IREYVVGGISYPLLPWLITPHDSDHPSDSMVAFNERHEKVRSVAATAFQQLKGSWRILSK 326

Query: 323 VMWRPDKRKLPSIILTCCLLHNIVIDCGDTLHPDVALSAHHDSGYQEQYCKQVDPSGRTM 382
           VMWRPD+RKLPSIIL CCLLHNI+IDCGD L  DV LS HHDSGY ++YCKQ +P G  +
Sbjct: 327 VMWRPDRRKLPSIILVCCLLHNIIIDCGDYLQEDVPLSGHHDSGYADRYCKQTEPLGSEL 386

Query: 383 RENLARHL 390
           R  L  HL
Sbjct: 387 RGCLTEHL 394


>AT3G55350.1 | Symbols:  | PIF / Ping-Pong family of plant
           transposases | chr3:20518518-20520690 FORWARD LENGTH=406
          Length = 406

 Score =  343 bits (879), Expect = 2e-94,   Method: Compositional matrix adjust.
 Identities = 170/359 (47%), Positives = 228/359 (63%), Gaps = 17/359 (4%)

Query: 37  DWWESFWHK---NSTAPGYSVSGDEEEGFKYFFRVSKTTFEYICSLVRQDLISRPPSGLI 93
           DWW+ F  +    ST P         + F+  F++S+ TF+YICSLV+ D  ++P +   
Sbjct: 53  DWWDGFSRRIYGGSTDP---------KTFESVFKISRKTFDYICSLVKADFTAKP-ANFS 102

Query: 94  NIEGRLLSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEALEERATHHLN 153
           +  G  LS+  +VA+ALRRL SGES   +G +FG+ QSTVSQ+TWRF+E++EERA HHL+
Sbjct: 103 DSNGNPLSLNDRVAVALRRLGSGESLSVIGETFGMNQSTVSQITWRFVESMEERAIHHLS 162

Query: 154 WPDCNRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETSYD-WCDQEKNYSMLFQGI 212
           WP  +++ EIK  FE   GLPNCCGA+D THI+M LPAVE S   W D EKN+SM  Q +
Sbjct: 163 WP--SKLDEIKSKFEKISGLPNCCGAIDITHIVMNLPAVEPSNKVWLDGEKNFSMTLQAV 220

Query: 213 VDHEMRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLGGDV-IREYVVGGY 271
           VD +MRF+D++ G PG +     LK SGFY+L + G+RLNG    L     +REY+VG  
Sbjct: 221 VDPDMRFLDVIAGWPGSLNDDVVLKNSGFYKLVEKGKRLNGEKLPLSERTELREYIVGDS 280

Query: 272 SYPLLPWLMTPYETNGISDSQSTFNYKHGAARLLAVRAFSLLKGSWRILSKVMWRPDKRK 331
            +PLLPWL+TPY+    S  Q+ FN +H  A   A  A S LK  WRI++ VMW PD+ +
Sbjct: 281 GFPLLPWLLTPYQGKPTSLPQTEFNKRHSEATKAAQMALSKLKDRWRIINGVMWMPDRNR 340

Query: 332 LPSIILTCCLLHNIVIDCGDTLHPDVALSAHHDSGYQEQYCKQVDPSGRTMRENLARHL 390
           LP II  CCLLHNI+ID  D    D  LS  HD  Y+++ CK  D +   +R+ L+  L
Sbjct: 341 LPRIIFVCCLLHNIIIDMEDQTLDDQPLSQQHDMNYRQRSCKLADEASSVLRDELSDQL 399


>AT5G12010.1 | Symbols:  | unknown protein; INVOLVED IN: response to
           salt stress; LOCATED IN: chloroplast, plasma membrane,
           membrane; EXPRESSED IN: 23 plant structures; EXPRESSED
           DURING: 13 growth stages; BEST Arabidopsis thaliana
           protein match is: unknown protein (TAIR:AT4G29780.1);
           Has 1807 Blast hits to 1807 proteins in 277 species:
           Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347;
           Plants - 385; Viruses - 0; Other Eukaryotes - 339
           (source: NCBI BLink). | chr5:3877975-3879483 REVERSE
           LENGTH=502
          Length = 502

 Score =  155 bits (393), Expect = 3e-38,   Method: Compositional matrix adjust.
 Identities = 100/341 (29%), Positives = 169/341 (49%), Gaps = 26/341 (7%)

Query: 59  EEGFKYFFRVSKTTFEYICSLVRQDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGES 118
           EE FK  FR+SK+TFE IC  +    +++  + L N     + V ++VA+ + RLA+GE 
Sbjct: 172 EEDFKKAFRMSKSTFELICDEL-NSAVAKEDTALRNA----IPVRQRVAVCIWRLATGEP 226

Query: 119 QVSVGASFGVGQSTVSQVTWRFIEALEE-RATHHLNWPDCNRMQEIKFGFEASYGLPNCC 177
              V   FG+G ST  ++     +A+++     +L WPD   ++ I+  FE+  G+PN  
Sbjct: 227 LRLVSKKFGLGISTCHKLVLEVCKAIKDVLMPKYLQWPDDESLRNIRERFESVSGIPNVV 286

Query: 178 GALDATHIMMTLPAVETS------YDWCDQEKNYSMLFQGIVDHEMRFIDIMTGLPGGMT 231
           G++  THI +  P +  +      +   +Q+ +YS+  Q +V+ +  F D+  G PG M 
Sbjct: 287 GSMYTTHIPIIAPKISVASYFNKRHTERNQKTSYSITIQAVVNPKGVFTDLCIGWPGSMP 346

Query: 232 FSRFLKCSGFYRLSQNGERLNGNVRTLGGDVIREYVVGGYSYPLLPWLMTPYETNGISDS 291
             + L+ S  Y+ + NG  L G            +V GG  +PLL W++ PY    ++ +
Sbjct: 347 DDKVLEKSLLYQRANNGGLLKGM-----------WVAGGPGHPLLDWVLVPYTQQNLTWT 395

Query: 292 QSTFNYKHGAARLLAVRAFSLLKGSWRILSKVMWRPDKRKLPSIILTCCLLHNIVIDCGD 351
           Q  FN K    + +A  AF  LKG W  L K       + LP+++  CC+LHNI     +
Sbjct: 396 QHAFNEKMSEVQGVAKEAFGRLKGRWACLQKRT-EVKLQDLPTVLGACCVLHNICEMREE 454

Query: 352 TLHPDVALSAHHDSGYQEQYCKQVDPSGRTMRENLARHLRH 392
            + P++ +    D    E   + V+      R+ ++ +L H
Sbjct: 455 KMEPELMVEVIDDEVLPENVLRSVN--AMKARDTISHNLLH 493


>AT4G29780.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT5G12010.1); Has 945 Blast hits to 944 proteins
           in 87 species: Archae - 0; Bacteria - 0; Metazoa - 519;
           Fungi - 43; Plants - 365; Viruses - 0; Other Eukaryotes
           - 18 (source: NCBI BLink). | chr4:14579859-14581481
           FORWARD LENGTH=540
          Length = 540

 Score =  129 bits (325), Expect = 3e-30,   Method: Compositional matrix adjust.
 Identities = 98/371 (26%), Positives = 169/371 (45%), Gaps = 38/371 (10%)

Query: 29  VEPRTSETDWWESFWHKNSTAPGYSVSGDEEEGFKYFFRVSKTTFEYICSLVRQDLISRP 88
           V+ RT  TDWW+       + P +      E+ F+  FR+SK+TF  IC  +    +++ 
Sbjct: 192 VKERT--TDWWDRV-----SRPDFP-----EDEFRREFRMSKSTFNLICEEL-DTTVTKK 238

Query: 89  PSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEALEE-R 147
            + L +     +   K+V + + RLA+G     V   FG+G ST  ++      A+ +  
Sbjct: 239 NTMLRDA----IPAPKRVGVCVWRLATGAPLRHVSERFGLGISTCHKLVIEVCRAIYDVL 294

Query: 148 ATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETS------YDWCDQ 201
              +L WP  + +   K  FE+ + +PN  G++  THI +  P V  +      +   +Q
Sbjct: 295 MPKYLLWPSDSEINSTKAKFESVHKIPNVVGSIYTTHIPIIAPKVHVAAYFNKRHTERNQ 354

Query: 202 EKNYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLGGD 261
           + +YS+  QG+V+ +  F D+  G PG +T  + L+ S   R            R   G 
Sbjct: 355 KTSYSITVQGVVNADGIFTDVCIGNPGSLTDDQILEKSSLSRQ-----------RAARGM 403

Query: 262 VIREYVVGGYSYPLLPWLMTPYETNGISDSQSTFNYKHGAARLLAVRAFSLLKGSWRILS 321
           +   ++VG   +PL  +L+ PY    ++ +Q  FN   G  + +A  AF  LKG W  L 
Sbjct: 404 LRDSWIVGNSGFPLTDYLLVPYTRQNLTWTQHAFNESIGEIQGIATAAFERLKGRWACLQ 463

Query: 322 KVMWRPDKRKLPSIILTCCLLHNIVIDCGDTLHPDVALSAHHDSGYQEQYCKQVDPSGRT 381
           K       + LP ++  CC+LHNI     + + P++      D    E   +    S   
Sbjct: 464 KRT-EVKLQDLPYVLGACCVLHNICEMRKEEMLPELKFEVFDDVAVPENNIRSA--SAVN 520

Query: 382 MRENLARHLRH 392
            R++++ +L H
Sbjct: 521 TRDHISHNLLH 531


>AT3G19120.1 | Symbols:  | PIF / Ping-Pong family of plant
           transposases | chr3:6609678-6611018 REVERSE LENGTH=446
          Length = 446

 Score =  103 bits (258), Expect = 2e-22,   Method: Compositional matrix adjust.
 Identities = 73/250 (29%), Positives = 118/250 (47%), Gaps = 6/250 (2%)

Query: 100 LSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEALEERA-THHLNWP-DC 157
           L  +  VA+ L RLA G S  ++ + + +    +S++T      L  +     +  P   
Sbjct: 146 LPADYAVAMVLSRLAHGCSAKTLASRYSLDPYLISKITNMVTRLLATKLYPEFIKIPVGK 205

Query: 158 NRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETSYDWCDQEKNY-SMLFQGIVDHE 216
            R+ E   GFE    LPN CGA+D+T + +         +    +  Y ++L Q + DH+
Sbjct: 206 RRLIETTQGFEELTSLPNICGAIDSTPVKLRRRTKLNPRNIYGCKYGYDAVLLQVVADHK 265

Query: 217 MRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLGGDVIREYVVGGYSYPLL 276
             F D+    PGG   S   + S  Y+   +G+ +   V  + G  +R Y+VG + YPLL
Sbjct: 266 KIFWDVCVKAPGGEDDSSHFRDSLLYKRLTSGDIVWEKVINIRGHHVRPYIVGDWCYPLL 325

Query: 277 PWLMTPYETNGI-SDSQSTFNYKHGAARLLAVRAFSLLKGSWRILSKVMWRPDKRKLPSI 335
            +LMTP+  NG  +  ++ F+      R + V A  LLK  W+IL  +         P  
Sbjct: 326 SFLMTPFSPNGSGTPPENLFDGMLMKGRSVVVEAIGLLKARWKILQSL--NVGVNHAPQT 383

Query: 336 ILTCCLLHNI 345
           I+ CC+LHN+
Sbjct: 384 IVACCVLHNL 393


>AT1G72270.2 | Symbols:  | LOCATED IN: mitochondrion; EXPRESSED IN:
           shoot apex, embryo, flower, seed; EXPRESSED DURING:
           petal differentiation and expansion stage, E expanded
           cotyledon stage, D bilateral stage; BEST Arabidopsis
           thaliana protein match is: PIF / Ping-Pong family of
           plant transposases (TAIR:AT3G55350.1). |
           chr1:27209890-27211122 REVERSE LENGTH=410
          Length = 410

 Score = 89.4 bits (220), Expect = 4e-18,   Method: Compositional matrix adjust.
 Identities = 79/292 (27%), Positives = 130/292 (44%), Gaps = 45/292 (15%)

Query: 65  FFRVSKTTFEYICSLVRQDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGA 124
           +FR+SK+TF  + S++     S  PS                A  + RLA G S   +  
Sbjct: 100 YFRMSKSTFFSLYSILSH---SSLPS---------------FAATIFRLAHGASYECLVH 141

Query: 125 SFGV-GQSTVSQVTWRFIEALEERATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDAT 183
            FG    S  S+  +   + + E+ +  L+ P  +    +         LPNC G +   
Sbjct: 142 RFGFDSTSQASRSFFTVCKLINEKLSQQLDDPKPDFSPNL---------LPNCYGVVGFG 192

Query: 184 HIMMTLPAVETSYDWCDQEKNYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYR 243
              +    +             S+L Q +VD   RF+DI  G P  M      + +  + 
Sbjct: 193 RFEVKGKLLGA---------KGSILVQALVDSNGRFVDISAGWPSTMKPEAIFRQTKLFS 243

Query: 244 LSQNGERLNGNVRTLG-GDVIREYVVGGYSYPLLPWLMTPYETNGISDS--QSTFNYKHG 300
           +++  E L+G    LG G ++  Y++G    PLLPWL+TPY+     +S  +   N  H 
Sbjct: 244 IAE--EVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFREEFNNVVHT 301

Query: 301 AARLLAVRAFSLLKGSWRILSKVMWRPDKRK-LPSIILTCCLLHNIVIDCGD 351
               + + AF+ ++  WRIL K  W+P+  + +P +I T CLLHN +++ GD
Sbjct: 302 GLHSVEI-AFAKVRARWRILDK-KWKPETIEFMPFVITTGCLLHNFLVNSGD 351


>AT1G72270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Ribosome 60S
           biogenesis N-terminal (InterPro:IPR021714); BEST
           Arabidopsis thaliana protein match is: unknown protein
           (TAIR:AT4G27010.1); Has 772 Blast hits to 657 proteins
           in 120 species: Archae - 0; Bacteria - 0; Metazoa - 344;
           Fungi - 94; Plants - 322; Viruses - 0; Other Eukaryotes
           - 12 (source: NCBI BLink). | chr1:27199733-27211122
           REVERSE LENGTH=2845
          Length = 2845

 Score = 87.8 bits (216), Expect = 1e-17,   Method: Compositional matrix adjust.
 Identities = 81/292 (27%), Positives = 131/292 (44%), Gaps = 45/292 (15%)

Query: 65  FFRVSKTTFEYICSLVRQDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGA 124
           +FR+SK+TF  + S++     S  PS                A  + RLA G S   +  
Sbjct: 100 YFRMSKSTFFSLYSILSH---SSLPS---------------FAATIFRLAHGASYECLVH 141

Query: 125 SFGV-GQSTVSQVTWRFIEALEERATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDAT 183
            FG    S  S+  +   + + E+ +  L+ P        K  F  +  LPNC G +   
Sbjct: 142 RFGFDSTSQASRSFFTVCKLINEKLSQQLDDP--------KPDFSPNL-LPNCYGVVGFG 192

Query: 184 HIMMTLPAVETSYDWCDQEKNYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYR 243
              +    +             S+L Q +VD   RF+DI  G P  M      + +  + 
Sbjct: 193 RFEVKGKLLGA---------KGSILVQALVDSNGRFVDISAGWPSTMKPEAIFRQTKLFS 243

Query: 244 LSQNGERLNGNVRTLG-GDVIREYVVGGYSYPLLPWLMTPYETNGISDS--QSTFNYKHG 300
           +++  E L+G    LG G ++  Y++G    PLLPWL+TPY+     +S  +   N  H 
Sbjct: 244 IAE--EVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFREEFNNVVHT 301

Query: 301 AARLLAVRAFSLLKGSWRILSKVMWRPDKRK-LPSIILTCCLLHNIVIDCGD 351
               + + AF+ ++  WRIL K  W+P+  + +P +I T CLLHN +++ GD
Sbjct: 302 GLHSVEI-AFAKVRARWRILDK-KWKPETIEFMPFVITTGCLLHNFLVNSGD 351
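

Note on the per-alignment statistics: the "Identities", "Positives" and "Gaps" fields give a count over the alignment length followed by a percentage. The percentages printed in this report are consistent with truncation rather than rounding (for example 228/359 is about 63.5% but is shown as 63%). The following Python sketch extracts the three fields from a statistics line and recomputes the percentage under that assumption. The names STATS_RE, parse_alignment_stats and truncated_percent are hypothetical helpers, not part of BLAST.

```python
import re

# Matches e.g. "Identities = 253/368 (68%)" in a legacy BLAST statistics line.
STATS_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_alignment_stats(line):
    """Return {field: (count, alignment_length, reported_percent)}."""
    return {
        name: (int(count), int(length), int(pct))
        for name, count, length, pct in STATS_RE.findall(line)
    }


def truncated_percent(count, length):
    """Integer percentage with the fraction dropped (assumed BLAST behaviour)."""
    return 100 * count // length


line = " Identities = 170/359 (47%), Positives = 228/359 (63%), Gaps = 17/359 (4%)"
for name, (count, length, reported) in parse_alignment_stats(line).items():
    print(name, count, length, reported, truncated_percent(count, length))
```

When comparing hits, the raw counts are more informative than the printed percentages, since the alignment length differs between hits (368 columns for AT3G63270.1 but only 250 for AT3G19120.1).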