Miyakogusa Predicted Gene
- Lj4g3v0619970.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj4g3v0619970.1 Non Characterized Hit- tr|Q9SZQ1|Q9SZQ1_ARATH
Putative uncharacterized protein AT4g29780
OS=Arabidop,25.68,2e-18,seg,NULL; DDE_4,NULL;
UNCHARACTERIZED,Harbinger transposase-derived nuclease,CUFF.47644.1
(414 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                    Score    E
Sequences producing significant alignments:                        (bits) Value
AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant transp...   110   1e-24
AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h...   104   1e-22
AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    99   4e-21
AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response...    99   5e-21
AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED I...    71   2e-12
AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 6...    70   3e-12
AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant transp...    65   8e-11
>AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:20518518-20520690 FORWARD LENGTH=406
Length = 406
Score = 110 bits (276), Expect = 1e-24, Method: Compositional matrix adjust.
Identities = 88/332 (26%), Positives = 149/332 (44%), Gaps = 36/332 (10%)
Query: 68 RKRHGGENQNPPPRSPDWFRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLN---LSSG 124
R+ +GG P F + F ++ TF+++ L++ + PA+ N LS
Sbjct: 60 RRIYGGST------DPKTFESVFKISRKTFDYICSLVKADFTAK-PANFSDSNGNPLSLN 112
Query: 125 VRLGIGLFRLATGSDYADISDRFGVSVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSV 184
R+ + L RL +G + I + FG++ + + +S+P + L +
Sbjct: 113 DRVAVALRRLGSGESLSVIGETFGMNQSTVSQITWRFVESMEERAIHHLSWP--SKLDEI 170
Query: 185 SHGFESLSGLPNCCGVVFCSRF-----DVSPSAAT--KNAQNHRVAAQIVVDSTCKILSI 237
FE +SGLPNCCG + + V PS +N + Q VVD + L +
Sbjct: 171 KSKFEKISGLPNCCGAIDITHIVMNLPAVEPSNKVWLDGEKNFSMTLQAVVDPDMRFLDV 230
Query: 238 AAGFLGEKSDSQILKASTLCEDIEDGTLLNA---PSTDNAGVNQYLVGDSGYPLLPWLMV 294
AG+ G +D +LK S + +E G LN P ++ + +Y+VGDSG+PLLPWL+
Sbjct: 231 IAGWPGSLNDDVVLKNSGFYKLVEKGKRLNGEKLPLSERTELREYIVGDSGFPLLPWLLT 290
Query: 295 PFAEAAPGSVEGNFNEVH------GLMRLTALRTEASLRNWGVLSKPVKEEVKMAVAYMG 348
P+ + FN+ H M L+ L+ + N GV+ P + + +
Sbjct: 291 PYQGKPTSLPQTEFNKRHSEATKAAQMALSKLKDRWRIIN-GVMWMPDRNRLPRIIF--- 346
Query: 349 ACSILHNSLLMREDFS----ALATGFDFDYHE 376
C +LHN ++ ED + L+ D +Y +
Sbjct: 347 VCCLLHNIIIDMEDQTLDDQPLSQQHDMNYRQ 378
>AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative
harbinger transposase-derived nuclease
(InterPro:IPR006912); BEST Arabidopsis thaliana protein
match is: PIF / Ping-Pong family of plant transposases
(TAIR:AT3G55350.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr3:23375932-23377398 REVERSE LENGTH=396
Length = 396
Score = 104 bits (259), Expect = 1e-22, Method: Compositional matrix adjust.
Identities = 76/292 (26%), Positives = 133/292 (45%), Gaps = 14/292 (4%)
Query: 86 FRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLN---LSSGVRLGIGLFRLATGSDYAD 142
F+ F + +TF ++ L+ L R P+ L+ + LS ++ I L RLA+G
Sbjct: 65 FKHFFRASKTTFSYICSLVREDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGDSQVS 124
Query: 143 ISDRFGVSVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSVSHGFESLSGLPNCCGVVF 202
+ FGV + L + + +P + + + FE + GLPNCCG +
Sbjct: 125 VGAAFGVGQSTVSQVTWRFIEALEERAKHHLRWPDSDRIEEIKSKFEEMYGLPNCCGAID 184
Query: 203 CSRFDVSPSAAT------KNAQNHRVAAQIVVDSTCKILSIAAGFLGEKSDSQILKASTL 256
+ ++ A +N+ + Q V D + L++ G+ G + S++LK S
Sbjct: 185 TTHIIMTLPAVQASDDWCDQEKNYSMFLQGVFDHEMRFLNMVTGWPGGMTVSKLLKFSGF 244
Query: 257 CEDIEDGTLLNA-PST--DNAGVNQYLVGDSGYPLLPWLMVPFAEAAPGSVEGNFNEVHG 313
+ E+ +L+ P T A + +Y+VG YPLLPWL+ P P FNE H
Sbjct: 245 FKLCENAQILDGNPKTLSQGAQIREYVVGGISYPLLPWLITPHDSDHPSDSMVAFNERHE 304
Query: 314 LMRLTALRTEASLR-NWGVLSKPV-KEEVKMAVAYMGACSILHNSLLMREDF 363
+R A L+ +W +LSK + + + + + + C +LHN ++ D+
Sbjct: 305 KVRSVAATAFQQLKGSWRILSKVMWRPDRRKLPSIILVCCLLHNIIIDCGDY 356
>AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT5G12010.1); Has 945 Blast hits to 944 proteins
in 87 species: Archae - 0; Bacteria - 0; Metazoa - 519;
Fungi - 43; Plants - 365; Viruses - 0; Other Eukaryotes
- 18 (source: NCBI BLink). | chr4:14579859-14581481
FORWARD LENGTH=540
Length = 540
Score = 99.4 bits (246), Expect = 4e-21, Method: Compositional matrix adjust.
Identities = 76/296 (25%), Positives = 135/296 (45%), Gaps = 31/296 (10%)
Query: 84 DWFRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLNLSSGVRLGIGLFRLATGSDYADI 143
D FR F M+ STF + L+ + ++ +L + + R+G+ ++RLATG+ +
Sbjct: 211 DEFRREFRMSKSTFNLICEELDTTVTKKNT--MLRDAIPAPKRVGVCVWRLATGAPLRHV 268
Query: 144 SDRFGVSVPVARFCVKQLCR----VLCTNFRFWISFPGPNDLPSVSHGFESLSGLPNCCG 199
S+RFG+ + V ++CR VL + W P +++ S FES+ +PN G
Sbjct: 269 SERFGLGISTCHKLVIEVCRAIYDVLMPKYLLW---PSDSEINSTKAKFESVHKIPNVVG 325
Query: 200 VVFCSRFD-VSPSAATKNAQNHR-----------VAAQIVVDSTCKILSIAAGFLGEKSD 247
++ + ++P N R + Q VV++ + G G +D
Sbjct: 326 SIYTTHIPIIAPKVHVAAYFNKRHTERNQKTSYSITVQGVVNADGIFTDVCIGNPGSLTD 385
Query: 248 SQILKASTLCEDIEDGTLLNAPSTDNAGVNQYLVGDSGYPLLPWLMVPFAEAAPGSVEGN 307
QIL+ S+L +L + ++VG+SG+PL +L+VP+ +
Sbjct: 386 DQILEKSSLSRQRAARGMLR---------DSWIVGNSGFPLTDYLLVPYTRQNLTWTQHA 436
Query: 308 FNEVHGLMRLTALRTEASLR-NWGVLSKPVKEEVKMAVAYMGACSILHNSLLMRED 362
FNE G ++ A L+ W L K + +++ +GAC +LHN MR++
Sbjct: 437 FNESIGEIQGIATAAFERLKGRWACLQKRTEVKLQDLPYVLGACCVLHNICEMRKE 492
>AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response to
salt stress; LOCATED IN: chloroplast, plasma membrane,
membrane; EXPRESSED IN: 23 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT4G29780.1);
Has 1807 Blast hits to 1807 proteins in 277 species:
Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347;
Plants - 385; Viruses - 0; Other Eukaryotes - 339
(source: NCBI BLink). | chr5:3877975-3879483 REVERSE
LENGTH=502
Length = 502
Score = 99.0 bits (245), Expect = 5e-21, Method: Compositional matrix adjust.
Identities = 75/296 (25%), Positives = 133/296 (44%), Gaps = 35/296 (11%)
Query: 86 FRASFLMTSSTFEWLSGLLEPLLDCRDPA--DLLPLNLSSGVRLGIGLFRLATGSDYADI 143
F+ +F M+ STFE + L + D A + +P+ R+ + ++RLATG +
Sbjct: 175 FKKAFRMSKSTFELICDELNSAVAKEDTALRNAIPVRQ----RVAVCIWRLATGEPLRLV 230
Query: 144 SDRFGVSVPVARFCVKQLCR----VLCTNFRFWISFPGPNDLPSVSHGFESLSGLPNCCG 199
S +FG+ + V ++C+ VL + W P L ++ FES+SG+PN G
Sbjct: 231 SKKFGLGISTCHKLVLEVCKAIKDVLMPKYLQW---PDDESLRNIRERFESVSGIPNVVG 287
Query: 200 VVFCSRFDV-SPSAATKNAQNHR-----------VAAQIVVDSTCKILSIAAGFLGEKSD 247
++ + + +P + + N R + Q VV+ + G+ G D
Sbjct: 288 SMYTTHIPIIAPKISVASYFNKRHTERNQKTSYSITIQAVVNPKGVFTDLCIGWPGSMPD 347
Query: 248 SQILKASTLCEDIEDGTLLNAPSTDNAGVNQYLVGDSGYPLLPWLMVPFAEAAPGSVEGN 307
++L+ S L + +G LL ++ G G+PLL W++VP+ + +
Sbjct: 348 DKVLEKSLLYQRANNGGLLKG---------MWVAGGPGHPLLDWVLVPYTQQNLTWTQHA 398
Query: 308 FNEVHGLMRLTALRTEASLR-NWGVLSKPVKEEVKMAVAYMGACSILHNSLLMRED 362
FNE ++ A L+ W L K + +++ +GAC +LHN MRE+
Sbjct: 399 FNEKMSEVQGVAKEAFGRLKGRWACLQKRTEVKLQDLPTVLGACCVLHNICEMREE 454
>AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED IN:
shoot apex, embryo, flower, seed; EXPRESSED DURING:
petal differentiation and expansion stage, E expanded
cotyledon stage, D bilateral stage; BEST Arabidopsis
thaliana protein match is: PIF / Ping-Pong family of
plant transposases (TAIR:AT3G55350.1). |
chr1:27209890-27211122 REVERSE LENGTH=410
Length = 410
Score = 70.9 bits (172), Expect = 2e-12, Method: Compositional matrix adjust.
Identities = 66/238 (27%), Positives = 102/238 (42%), Gaps = 19/238 (7%)
Query: 131 LFRLATGSDYADISDRFGV-SVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSVSHGFE 189
+FRLA G+ Y + RFG S A +C+++ + P P+ P++
Sbjct: 127 IFRLAHGASYECLVHRFGFDSTSQASRSFFTVCKLINEKLSQQLDDPKPDFSPNL----- 181
Query: 190 SLSGLPNCCGVVFCSRFDVSPSAATKNAQNHRVAAQIVVDSTCKILSIAAGFLGEKSDSQ 249
LPNC GVV RF+V + Q +VDS + + I+AG+
Sbjct: 182 ----LPNCYGVVGFGRFEVKGKLLGAKGS---ILVQALVDSNGRFVDISAGWPSTMKPEA 234
Query: 250 ILKASTLCEDIEDGTLLNAPSTDNAGV--NQYLVGDSGYPLLPWLMVPF-AEAAPGSVEG 306
I + + L I + L AP+ GV +Y++GDS PLLPWL+ P+ + S
Sbjct: 235 IFRQTKLFS-IAEEVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFRE 293
Query: 307 NFNE-VHGLMRLTALRTEASLRNWGVLSKPVK-EEVKMAVAYMGACSILHNSLLMRED 362
FN VH + + W +L K K E ++ + +LHN L+ D
Sbjct: 294 EFNNVVHTGLHSVEIAFAKVRARWRILDKKWKPETIEFMPFVITTGCLLHNFLVNSGD 351
>AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 60S
biogenesis N-terminal (InterPro:IPR021714); BEST
Arabidopsis thaliana protein match is: unknown protein
(TAIR:AT4G27010.1); Has 772 Blast hits to 657 proteins
in 120 species: Archae - 0; Bacteria - 0; Metazoa - 344;
Fungi - 94; Plants - 322; Viruses - 0; Other Eukaryotes
- 12 (source: NCBI BLink). | chr1:27199733-27211122
REVERSE LENGTH=2845
Length = 2845
Score = 69.7 bits (169), Expect = 3e-12, Method: Compositional matrix adjust.
Identities = 65/234 (27%), Positives = 101/234 (43%), Gaps = 19/234 (8%)
Query: 131 LFRLATGSDYADISDRFGV-SVPVARFCVKQLCRVLCTNFRFWISFPGPNDLPSVSHGFE 189
+FRLA G+ Y + RFG S A +C+++ + P P+ P++
Sbjct: 127 IFRLAHGASYECLVHRFGFDSTSQASRSFFTVCKLINEKLSQQLDDPKPDFSPNL----- 181
Query: 190 SLSGLPNCCGVVFCSRFDVSPSAATKNAQNHRVAAQIVVDSTCKILSIAAGFLGEKSDSQ 249
LPNC GVV RF+V + Q +VDS + + I+AG+
Sbjct: 182 ----LPNCYGVVGFGRFEVKGKLLGAKGS---ILVQALVDSNGRFVDISAGWPSTMKPEA 234
Query: 250 ILKASTLCEDIEDGTLLNAPSTDNAGV--NQYLVGDSGYPLLPWLMVPF-AEAAPGSVEG 306
I + + L I + L AP+ GV +Y++GDS PLLPWL+ P+ + S
Sbjct: 235 IFRQTKLF-SIAEEVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFRE 293
Query: 307 NFNE-VHGLMRLTALRTEASLRNWGVLSKPVK-EEVKMAVAYMGACSILHNSLL 358
FN VH + + W +L K K E ++ + +LHN L+
Sbjct: 294 EFNNVVHTGLHSVEIAFAKVRARWRILDKKWKPETIEFMPFVITTGCLLHNFLV 347
>AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:6609678-6611018 REVERSE LENGTH=446
Length = 446
Score = 65.1 bits (157), Expect = 8e-11, Method: Compositional matrix adjust.
Identities = 74/298 (24%), Positives = 124/298 (41%), Gaps = 33/298 (11%)
Query: 77 NPPPRSPDWFRASFLMTSSTFEWLSGLLEPLLDCRDPADLLPLNLSSGVRLGIGLFRLAT 136
+ P R W R+ + ++ F + L+P + + L+L + + + L RLA
Sbjct: 109 DAPLRDARW-RSLYGLSYPVFITVVDKLKPFITASN------LSLPADYAVAMVLSRLAH 161
Query: 137 GSDYADISDRFGVSVPVARFCVKQLCRVLCTN-FRFWISFP-GPNDLPSVSHGFESLSGL 194
G ++ R+ + + + R+L T + +I P G L + GFE L+ L
Sbjct: 162 GCSAKTLASRYSLDPYLISKITNMVTRLLATKLYPEFIKIPVGKRRLIETTQGFEELTSL 221
Query: 195 PNCCGVVFCSRFDVSPSAATKNAQ-NHR-----------VAAQIVVDSTCKILSIAAGFL 242
PN CG + D +P + + N R V Q+V D +
Sbjct: 222 PNICGAI-----DSTPVKLRRRTKLNPRNIYGCKYGYDAVLLQVVADHKKIFWDVCVKAP 276
Query: 243 GEKSDSQILKASTLCEDIEDGTLLNAPSTDNAG--VNQYLVGDSGYPLLPWLMVPFAEAA 300
G + DS + S L + + G ++ + G V Y+VGD YPLL +LM PF+
Sbjct: 277 GGEDDSSHFRDSLLYKRLTSGDIVWEKVINIRGHHVRPYIVGDWCYPLLSFLMTPFSPNG 336
Query: 301 PGSVEGNFNEVHGLMRLTALRTEAS---LRNWGVLSKPVKEEVKMAVAYMGACSILHN 355
G+ N + LM+ ++ EA W +L + + V A + AC +LHN
Sbjct: 337 SGTPPENLFD-GMLMKGRSVVVEAIGLLKARWKIL-QSLNVGVNHAPQTIVACCVLHN 392