Miyakogusa Predicted Gene

Lj4g3v0619970.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj4g3v0619970.1 Non Characterized Hit- tr|Q9SZQ1|Q9SZQ1_ARATH
Putative uncharacterized protein AT4g29780
OS=Arabidop,25.68,2e-18,seg,NULL; DDE_4,NULL;
UNCHARACTERIZED,Harbinger transposase-derived nuclease,CUFF.47644.1
         (414 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT3G55350.1 | Symbols:  | PIF / Ping-Pong family of plant transp...   110   1e-24
AT3G63270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Putative h...   104   1e-22
AT4G29780.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...    99   4e-21
AT5G12010.1 | Symbols:  | unknown protein; INVOLVED IN: response...    99   5e-21
AT1G72270.2 | Symbols:  | LOCATED IN: mitochondrion; EXPRESSED I...    71   2e-12
AT1G72270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Ribosome 6...    70   3e-12
AT3G19120.1 | Symbols:  | PIF / Ping-Pong family of plant transp...    65   8e-11

>AT3G55350.1 | Symbols:  | PIF / Ping-Pong family of plant
           transposases | chr3:20518518-20520690 FORWARD LENGTH=406
          Length = 406

 Score =  110 bits (276), Expect = 1e-24,   Method: Compositional matrix adjust.
 Identities = 88/332 (26%), Positives = 149/332 (44%), Gaps = 36/332 (10%)

Query: 68  RKRHGGENQNPPPRSPDWFRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLN---LSSG 124
           R+ +GG         P  F + F ++  TF+++  L++     + PA+    N   LS  
Sbjct: 60  RRIYGGST------DPKTFESVFKISRKTFDYICSLVKADFTAK-PANFSDSNGNPLSLN 112

Query: 125 VRLGIGLFRLATGSDYADISDRFGVSVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSV 184
            R+ + L RL +G   + I + FG++         +    +       +S+P  + L  +
Sbjct: 113 DRVAVALRRLGSGESLSVIGETFGMNQSTVSQITWRFVESMEERAIHHLSWP--SKLDEI 170

Query: 185 SHGFESLSGLPNCCGVVFCSRF-----DVSPSAAT--KNAQNHRVAAQIVVDSTCKILSI 237
              FE +SGLPNCCG +  +        V PS        +N  +  Q VVD   + L +
Sbjct: 171 KSKFEKISGLPNCCGAIDITHIVMNLPAVEPSNKVWLDGEKNFSMTLQAVVDPDMRFLDV 230

Query: 238 AAGFLGEKSDSQILKASTLCEDIEDGTLLNA---PSTDNAGVNQYLVGDSGYPLLPWLMV 294
            AG+ G  +D  +LK S   + +E G  LN    P ++   + +Y+VGDSG+PLLPWL+ 
Sbjct: 231 IAGWPGSLNDDVVLKNSGFYKLVEKGKRLNGEKLPLSERTELREYIVGDSGFPLLPWLLT 290

Query: 295 PFAEAAPGSVEGNFNEVH------GLMRLTALRTEASLRNWGVLSKPVKEEVKMAVAYMG 348
           P+        +  FN+ H        M L+ L+    + N GV+  P +  +   +    
Sbjct: 291 PYQGKPTSLPQTEFNKRHSEATKAAQMALSKLKDRWRIIN-GVMWMPDRNRLPRIIF--- 346

Query: 349 ACSILHNSLLMREDFS----ALATGFDFDYHE 376
            C +LHN ++  ED +     L+   D +Y +
Sbjct: 347 VCCLLHNIIIDMEDQTLDDQPLSQQHDMNYRQ 378


>AT3G63270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Putative
           harbinger transposase-derived nuclease
           (InterPro:IPR006912); BEST Arabidopsis thaliana protein
           match is: PIF / Ping-Pong family of plant transposases
           (TAIR:AT3G55350.1); Has 30201 Blast hits to 17322
           proteins in 780 species: Archae - 12; Bacteria - 1396;
           Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
           0; Other Eukaryotes - 2996 (source: NCBI BLink). |
           chr3:23375932-23377398 REVERSE LENGTH=396
          Length = 396

 Score =  104 bits (259), Expect = 1e-22,   Method: Compositional matrix adjust.
 Identities = 76/292 (26%), Positives = 133/292 (45%), Gaps = 14/292 (4%)

Query: 86  FRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLN---LSSGVRLGIGLFRLATGSDYAD 142
           F+  F  + +TF ++  L+   L  R P+ L+ +    LS   ++ I L RLA+G     
Sbjct: 65  FKHFFRASKTTFSYICSLVREDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGDSQVS 124

Query: 143 ISDRFGVSVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSVSHGFESLSGLPNCCGVVF 202
           +   FGV          +    L    +  + +P  + +  +   FE + GLPNCCG + 
Sbjct: 125 VGAAFGVGQSTVSQVTWRFIEALEERAKHHLRWPDSDRIEEIKSKFEEMYGLPNCCGAID 184

Query: 203 CSRFDVSPSAAT------KNAQNHRVAAQIVVDSTCKILSIAAGFLGEKSDSQILKASTL 256
            +   ++  A           +N+ +  Q V D   + L++  G+ G  + S++LK S  
Sbjct: 185 TTHIIMTLPAVQASDDWCDQEKNYSMFLQGVFDHEMRFLNMVTGWPGGMTVSKLLKFSGF 244

Query: 257 CEDIEDGTLLNA-PST--DNAGVNQYLVGDSGYPLLPWLMVPFAEAAPGSVEGNFNEVHG 313
            +  E+  +L+  P T    A + +Y+VG   YPLLPWL+ P     P      FNE H 
Sbjct: 245 FKLCENAQILDGNPKTLSQGAQIREYVVGGISYPLLPWLITPHDSDHPSDSMVAFNERHE 304

Query: 314 LMRLTALRTEASLR-NWGVLSKPV-KEEVKMAVAYMGACSILHNSLLMREDF 363
            +R  A      L+ +W +LSK + + + +   + +  C +LHN ++   D+
Sbjct: 305 KVRSVAATAFQQLKGSWRILSKVMWRPDRRKLPSIILVCCLLHNIIIDCGDY 356


>AT4G29780.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT5G12010.1); Has 945 Blast hits to 944 proteins
           in 87 species: Archae - 0; Bacteria - 0; Metazoa - 519;
           Fungi - 43; Plants - 365; Viruses - 0; Other Eukaryotes
           - 18 (source: NCBI BLink). | chr4:14579859-14581481
           FORWARD LENGTH=540
          Length = 540

 Score = 99.4 bits (246), Expect = 4e-21,   Method: Compositional matrix adjust.
 Identities = 76/296 (25%), Positives = 135/296 (45%), Gaps = 31/296 (10%)

Query: 84  DWFRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLNLSSGVRLGIGLFRLATGSDYADI 143
           D FR  F M+ STF  +   L+  +  ++   +L   + +  R+G+ ++RLATG+    +
Sbjct: 211 DEFRREFRMSKSTFNLICEELDTTVTKKNT--MLRDAIPAPKRVGVCVWRLATGAPLRHV 268

Query: 144 SDRFGVSVPVARFCVKQLCR----VLCTNFRFWISFPGPNDLPSVSHGFESLSGLPNCCG 199
           S+RFG+ +      V ++CR    VL   +  W   P  +++ S    FES+  +PN  G
Sbjct: 269 SERFGLGISTCHKLVIEVCRAIYDVLMPKYLLW---PSDSEINSTKAKFESVHKIPNVVG 325

Query: 200 VVFCSRFD-VSPSAATKNAQNHR-----------VAAQIVVDSTCKILSIAAGFLGEKSD 247
            ++ +    ++P        N R           +  Q VV++      +  G  G  +D
Sbjct: 326 SIYTTHIPIIAPKVHVAAYFNKRHTERNQKTSYSITVQGVVNADGIFTDVCIGNPGSLTD 385

Query: 248 SQILKASTLCEDIEDGTLLNAPSTDNAGVNQYLVGDSGYPLLPWLMVPFAEAAPGSVEGN 307
            QIL+ S+L        +L          + ++VG+SG+PL  +L+VP+        +  
Sbjct: 386 DQILEKSSLSRQRAARGMLR---------DSWIVGNSGFPLTDYLLVPYTRQNLTWTQHA 436

Query: 308 FNEVHGLMRLTALRTEASLR-NWGVLSKPVKEEVKMAVAYMGACSILHNSLLMRED 362
           FNE  G ++  A      L+  W  L K  + +++     +GAC +LHN   MR++
Sbjct: 437 FNESIGEIQGIATAAFERLKGRWACLQKRTEVKLQDLPYVLGACCVLHNICEMRKE 492


>AT5G12010.1 | Symbols:  | unknown protein; INVOLVED IN: response to
           salt stress; LOCATED IN: chloroplast, plasma membrane,
           membrane; EXPRESSED IN: 23 plant structures; EXPRESSED
           DURING: 13 growth stages; BEST Arabidopsis thaliana
           protein match is: unknown protein (TAIR:AT4G29780.1);
           Has 1807 Blast hits to 1807 proteins in 277 species:
           Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347;
           Plants - 385; Viruses - 0; Other Eukaryotes - 339
           (source: NCBI BLink). | chr5:3877975-3879483 REVERSE
           LENGTH=502
          Length = 502

 Score = 99.0 bits (245), Expect = 5e-21,   Method: Compositional matrix adjust.
 Identities = 75/296 (25%), Positives = 133/296 (44%), Gaps = 35/296 (11%)

Query: 86  FRASFLMTSSTFEWLSGLLEPLLDCRDPA--DLLPLNLSSGVRLGIGLFRLATGSDYADI 143
           F+ +F M+ STFE +   L   +   D A  + +P+      R+ + ++RLATG     +
Sbjct: 175 FKKAFRMSKSTFELICDELNSAVAKEDTALRNAIPVRQ----RVAVCIWRLATGEPLRLV 230

Query: 144 SDRFGVSVPVARFCVKQLCR----VLCTNFRFWISFPGPNDLPSVSHGFESLSGLPNCCG 199
           S +FG+ +      V ++C+    VL   +  W   P    L ++   FES+SG+PN  G
Sbjct: 231 SKKFGLGISTCHKLVLEVCKAIKDVLMPKYLQW---PDDESLRNIRERFESVSGIPNVVG 287

Query: 200 VVFCSRFDV-SPSAATKNAQNHR-----------VAAQIVVDSTCKILSIAAGFLGEKSD 247
            ++ +   + +P  +  +  N R           +  Q VV+       +  G+ G   D
Sbjct: 288 SMYTTHIPIIAPKISVASYFNKRHTERNQKTSYSITIQAVVNPKGVFTDLCIGWPGSMPD 347

Query: 248 SQILKASTLCEDIEDGTLLNAPSTDNAGVNQYLVGDSGYPLLPWLMVPFAEAAPGSVEGN 307
            ++L+ S L +   +G LL            ++ G  G+PLL W++VP+ +      +  
Sbjct: 348 DKVLEKSLLYQRANNGGLLKG---------MWVAGGPGHPLLDWVLVPYTQQNLTWTQHA 398

Query: 308 FNEVHGLMRLTALRTEASLR-NWGVLSKPVKEEVKMAVAYMGACSILHNSLLMRED 362
           FNE    ++  A      L+  W  L K  + +++     +GAC +LHN   MRE+
Sbjct: 399 FNEKMSEVQGVAKEAFGRLKGRWACLQKRTEVKLQDLPTVLGACCVLHNICEMREE 454


>AT1G72270.2 | Symbols:  | LOCATED IN: mitochondrion; EXPRESSED IN:
           shoot apex, embryo, flower, seed; EXPRESSED DURING:
           petal differentiation and expansion stage, E expanded
           cotyledon stage, D bilateral stage; BEST Arabidopsis
           thaliana protein match is: PIF / Ping-Pong family of
           plant transposases (TAIR:AT3G55350.1). |
           chr1:27209890-27211122 REVERSE LENGTH=410
          Length = 410

 Score = 70.9 bits (172), Expect = 2e-12,   Method: Compositional matrix adjust.
 Identities = 66/238 (27%), Positives = 102/238 (42%), Gaps = 19/238 (7%)

Query: 131 LFRLATGSDYADISDRFGV-SVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSVSHGFE 189
           +FRLA G+ Y  +  RFG  S   A      +C+++       +  P P+  P++     
Sbjct: 127 IFRLAHGASYECLVHRFGFDSTSQASRSFFTVCKLINEKLSQQLDDPKPDFSPNL----- 181

Query: 190 SLSGLPNCCGVVFCSRFDVSPSAATKNAQNHRVAAQIVVDSTCKILSIAAGFLGEKSDSQ 249
               LPNC GVV   RF+V             +  Q +VDS  + + I+AG+        
Sbjct: 182 ----LPNCYGVVGFGRFEVKGKLLGAKGS---ILVQALVDSNGRFVDISAGWPSTMKPEA 234

Query: 250 ILKASTLCEDIEDGTLLNAPSTDNAGV--NQYLVGDSGYPLLPWLMVPF-AEAAPGSVEG 306
           I + + L   I +  L  AP+    GV   +Y++GDS  PLLPWL+ P+   +   S   
Sbjct: 235 IFRQTKLFS-IAEEVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFRE 293

Query: 307 NFNE-VHGLMRLTALRTEASLRNWGVLSKPVK-EEVKMAVAYMGACSILHNSLLMRED 362
            FN  VH  +    +        W +L K  K E ++     +    +LHN L+   D
Sbjct: 294 EFNNVVHTGLHSVEIAFAKVRARWRILDKKWKPETIEFMPFVITTGCLLHNFLVNSGD 351


>AT1G72270.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Ribosome 60S
           biogenesis N-terminal (InterPro:IPR021714); BEST
           Arabidopsis thaliana protein match is: unknown protein
           (TAIR:AT4G27010.1); Has 772 Blast hits to 657 proteins
           in 120 species: Archae - 0; Bacteria - 0; Metazoa - 344;
           Fungi - 94; Plants - 322; Viruses - 0; Other Eukaryotes
           - 12 (source: NCBI BLink). | chr1:27199733-27211122
           REVERSE LENGTH=2845
          Length = 2845

 Score = 69.7 bits (169), Expect = 3e-12,   Method: Compositional matrix adjust.
 Identities = 65/234 (27%), Positives = 101/234 (43%), Gaps = 19/234 (8%)

Query: 131 LFRLATGSDYADISDRFGV-SVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSVSHGFE 189
           +FRLA G+ Y  +  RFG  S   A      +C+++       +  P P+  P++     
Sbjct: 127 IFRLAHGASYECLVHRFGFDSTSQASRSFFTVCKLINEKLSQQLDDPKPDFSPNL----- 181

Query: 190 SLSGLPNCCGVVFCSRFDVSPSAATKNAQNHRVAAQIVVDSTCKILSIAAGFLGEKSDSQ 249
               LPNC GVV   RF+V             +  Q +VDS  + + I+AG+        
Sbjct: 182 ----LPNCYGVVGFGRFEVKGKLLGAKGS---ILVQALVDSNGRFVDISAGWPSTMKPEA 234

Query: 250 ILKASTLCEDIEDGTLLNAPSTDNAGV--NQYLVGDSGYPLLPWLMVPF-AEAAPGSVEG 306
           I + + L   I +  L  AP+    GV   +Y++GDS  PLLPWL+ P+   +   S   
Sbjct: 235 IFRQTKLF-SIAEEVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFRE 293

Query: 307 NFNE-VHGLMRLTALRTEASLRNWGVLSKPVK-EEVKMAVAYMGACSILHNSLL 358
            FN  VH  +    +        W +L K  K E ++     +    +LHN L+
Sbjct: 294 EFNNVVHTGLHSVEIAFAKVRARWRILDKKWKPETIEFMPFVITTGCLLHNFLV 347


>AT3G19120.1 | Symbols:  | PIF / Ping-Pong family of plant
           transposases | chr3:6609678-6611018 REVERSE LENGTH=446
          Length = 446

 Score = 65.1 bits (157), Expect = 8e-11,   Method: Compositional matrix adjust.
 Identities = 74/298 (24%), Positives = 124/298 (41%), Gaps = 33/298 (11%)

Query: 77  NPPPRSPDWFRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLNLSSGVRLGIGLFRLAT 136
           + P R   W R+ + ++   F  +   L+P +   +      L+L +   + + L RLA 
Sbjct: 109 DAPLRDARW-RSLYGLSYPVFITVVDKLKPFITASN------LSLPADYAVAMVLSRLAH 161

Query: 137 GSDYADISDRFGVSVPVARFCVKQLCRVLCTN-FRFWISFP-GPNDLPSVSHGFESLSGL 194
           G     ++ R+ +   +       + R+L T  +  +I  P G   L   + GFE L+ L
Sbjct: 162 GCSAKTLASRYSLDPYLISKITNMVTRLLATKLYPEFIKIPVGKRRLIETTQGFEELTSL 221

Query: 195 PNCCGVVFCSRFDVSPSAATKNAQ-NHR-----------VAAQIVVDSTCKILSIAAGFL 242
           PN CG +     D +P    +  + N R           V  Q+V D       +     
Sbjct: 222 PNICGAI-----DSTPVKLRRRTKLNPRNIYGCKYGYDAVLLQVVADHKKIFWDVCVKAP 276

Query: 243 GEKSDSQILKASTLCEDIEDGTLLNAPSTDNAG--VNQYLVGDSGYPLLPWLMVPFAEAA 300
           G + DS   + S L + +  G ++     +  G  V  Y+VGD  YPLL +LM PF+   
Sbjct: 277 GGEDDSSHFRDSLLYKRLTSGDIVWEKVINIRGHHVRPYIVGDWCYPLLSFLMTPFSPNG 336

Query: 301 PGSVEGNFNEVHGLMRLTALRTEAS---LRNWGVLSKPVKEEVKMAVAYMGACSILHN 355
            G+   N  +   LM+  ++  EA       W +L + +   V  A   + AC +LHN
Sbjct: 337 SGTPPENLFD-GMLMKGRSVVVEAIGLLKARWKIL-QSLNVGVNHAPQTIVACCVLHN 392