Miyakogusa Predicted Gene
- Lj3g3v3165180.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj3g3v3165180.1 Non Characterized Hit- tr|I1IJ83|I1IJ83_BRADI
Uncharacterized protein OS=Brachypodium distachyon
GN=,43.01,0.0000000000002,UNCHARACTERIZED,Harbinger
transposase-derived nuclease; DDE_4,NULL,CUFF.45365.1
(388 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Sequences producing significant alignments: Score (bits) E Value
AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant transp... 376 e-104
AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h... 303 1e-82
AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response... 124 2e-28
AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis thal... 103 3e-22
AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant transp... 97 1e-20
AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED I... 89 8e-18
AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 6... 86 3e-17
AT5G41980.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h... 61 1e-09
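The hit table above can be read programmatically. The following is a minimal sketch, not part of the BLAST distribution: the regular expression `HIT_LINE` and the helpers `parse_evalue` and `parse_hits` are hypothetical names introduced here for illustration. It assumes each summary line ends with a bit score followed by an E-value, and that legacy BLAST abbreviates `1e-104` as `e-104`, as in the first row.

```python
import re

# Three summary lines copied from the hit table above
# (BLAST itself truncates the descriptions with "...").
SUMMARY = """\
AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant transp... 376 e-104
AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h... 303 1e-82
AT5G41980.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h... 61 1e-09
"""

# Hypothetical pattern: identifier, free-text description, bit score, E-value.
HIT_LINE = re.compile(
    r"^(\S+)\s+(.*?)\s+(\d+(?:\.\d+)?)\s+(\d*\.?\d*e[-+]?\d+|\d+(?:\.\d+)?)$"
)


def parse_evalue(text):
    """Convert an E-value string to float, restoring a dropped leading '1'."""
    return float("1" + text if text.startswith("e") else text)


def parse_hits(block):
    """Return (identifier, bit_score, e_value) tuples for each summary line."""
    hits = []
    for line in block.splitlines():
        match = HIT_LINE.match(line.strip())
        if match:
            hits.append(
                (match.group(1), float(match.group(3)), parse_evalue(match.group(4)))
            )
    return hits


hits = parse_hits(SUMMARY)
for identifier, bits, evalue in hits:
    print(identifier, bits, evalue)
```

For anything beyond a quick check, an established BLAST parser is likely a safer choice than a hand-written pattern like this one.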
>AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:20518518-20520690 FORWARD LENGTH=406
Length = 406
Score = 376 bits (966), Expect = e-104, Method: Compositional matrix adjust.
Identities = 183/355 (51%), Positives = 246/355 (69%), Gaps = 9/355 (2%)
Query: 30 LDWSDEFSEKINGRHQNTPT--SVFKITKSMFDYICSLVEEDMMEKPPHLAFTNGQPVSL 87
LDW D FS +I G + T SVFKI++ FDYICSLV+ D KP + + +NG P+SL
Sbjct: 52 LDWWDGFSRRIYGGSTDPKTFESVFKISRKTFDYICSLVKADFTAKPANFSDSNGNPLSL 111
Query: 88 HDQVAVALRRLGSGDSLVLIGGIFGVSNTMVSQITWKFVESMENRGLLHLKWPPTERELI 147
+D+VAVALRRLGSG+SL +IG FG++ + VSQITW+FVESME R + HL WP +L
Sbjct: 112 NDRVAVALRRLGSGESLSVIGETFGMNQSTVSQITWRFVESMEERAIHHLSWP---SKLD 168
Query: 148 EIKSKFGKLQSLPNCCGVVDVTHINMCLASTEPNKDVWLDHENKHSMVLQAIVDPDMRFR 207
EIKSKF K+ LPNCCG +D+THI M L + EP+ VWLD E SM LQA+VDPDMRF
Sbjct: 169 EIKSKFEKISGLPNCCGAIDITHIVMNLPAVEPSNKVWLDGEKNFSMTLQAVVDPDMRFL 228
Query: 208 DIVTGWPGQMKDWMVFEDSTFHKLCDEGERLNGNIIRLPDGSEIREYIIGDSGYPLLPYL 267
D++ GWPG + D +V ++S F+KL ++G+RLNG + L + +E+REYI+GDSG+PLLP+L
Sbjct: 229 DVIAGWPGSLNDDVVLKNSGFYKLVEKGKRLNGEKLPLSERTELREYIVGDSGFPLLPWL 288
Query: 268 IVPYKGEEQELSVSQANFNRRHLETRMVAKRALVMLKEMWGIIQGTMWRPNKHRLSKVIL 327
+ PY+G + S+ Q FN+RH E A+ AL LK+ W II G MW P+++RL ++I
Sbjct: 289 LTPYQG--KPTSLPQTEFNKRHSEATKAAQMALSKLKDRWRIINGVMWMPDRNRLPRIIF 346
Query: 328 VCCILHNIVIDMGDRVQNEQLSNLPTNHDPGYHQLICEAEDPQGVLLRENVSRYL 382
VCC+LHNI+IDM D+ ++Q L HD Y Q C+ D +LR+ +S L
Sbjct: 347 VCCLLHNIIIDMEDQTLDDQ--PLSQQHDMNYRQRSCKLADEASSVLRDELSDQL 399
>AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative
harbinger transposase-derived nuclease
(InterPro:IPR006912); BEST Arabidopsis thaliana protein
match is: PIF / Ping-Pong family of plant transposases
(TAIR:AT3G55350.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr3:23375932-23377398 REVERSE LENGTH=396
Length = 396
Score = 303 bits (777), Expect = 1e-82, Method: Compositional matrix adjust.
Identities = 156/358 (43%), Positives = 217/358 (60%), Gaps = 12/358 (3%)
Query: 31 DWSDEF-----SEKINGRHQNTPTSVFKITKSMFDYICSLVEEDMMEKPPH-LAFTNGQP 84
DW D F S + F+ +K+ F YICSLV ED++ +PP L G+
Sbjct: 43 DWWDTFWLRNSSPSVPSDEDYAFKHFFRASKTTFSYICSLVREDLISRPPSGLINIEGRL 102
Query: 85 VSLHDQVAVALRRLGSGDSLVLIGGIFGVSNTMVSQITWKFVESMENRGLLHLKWPPTER 144
+S+ QVA+ALRRL SGDS V +G FGV + VSQ+TW+F+E++E R HL+WP ++R
Sbjct: 103 LSVEKQVAIALRRLASGDSQVSVGAAFGVGQSTVSQVTWRFIEALEERAKHHLRWPDSDR 162
Query: 145 ELIEIKSKFGKLQSLPNCCGVVDVTHINMCLASTEPNKDVWLDHENKHSMVLQAIVDPDM 204
+ EIKSKF ++ LPNCCG +D THI M L + + + D W D E +SM LQ + D +M
Sbjct: 163 -IEEIKSKFEEMYGLPNCCGAIDTTHIIMTLPAVQASDD-WCDQEKNYSMFLQGVFDHEM 220
Query: 205 RFRDIVTGWPGQMKDWMVFEDSTFHKLCDEGERLNGNIIRLPDGSEIREYIIGDSGYPLL 264
RF ++VTGWPG M + + S F KLC+ + L+GN L G++IREY++G YPLL
Sbjct: 221 RFLNMVTGWPGGMTVSKLLKFSGFFKLCENAQILDGNPKTLSQGAQIREYVVGGISYPLL 280
Query: 265 PYLIVPYKGEEQELSVSQANFNRRHLETRMVAKRALVMLKEMWGIIQGTMWRPNKHRLSK 324
P+LI P+ + S S FN RH + R VA A LK W I+ MWRP++ +L
Sbjct: 281 PWLITPHDSDHP--SDSMVAFNERHEKVRSVAATAFQQLKGSWRILSKVMWRPDRRKLPS 338
Query: 325 VILVCCILHNIVIDMGDRVQNEQLSNLPTNHDPGYHQLICEAEDPQGVLLRENVSRYL 382
+ILVCC+LHNI+ID GD +Q + L +HD GY C+ +P G LR ++ +L
Sbjct: 339 IILVCCLLHNIIIDCGDYLQED--VPLSGHHDSGYADRYCKQTEPLGSELRGCLTEHL 394
>AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response to
salt stress; LOCATED IN: chloroplast, plasma membrane,
membrane; EXPRESSED IN: 23 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT4G29780.1);
Has 1807 Blast hits to 1807 proteins in 277 species:
Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347;
Plants - 385; Viruses - 0; Other Eukaryotes - 339
(source: NCBI BLink). | chr5:3877975-3879483 REVERSE
LENGTH=502
Length = 502
Score = 124 bits (310), Expect = 2e-28, Method: Compositional matrix adjust.
Identities = 84/291 (28%), Positives = 143/291 (49%), Gaps = 26/291 (8%)
Query: 51 VFKITKSMFDYICSLVEEDMMEKPPHLAFTNGQPVSLHDQVAVALRRLGSGDSLVLIGGI 110
F+++KS F+ IC + + ++ A N PV +VAV + RL +G+ L L+
Sbjct: 178 AFRMSKSTFELICDELNSAVAKE--DTALRNAIPV--RQRVAVCIWRLATGEPLRLVSKK 233
Query: 111 FGVSNTMVSQITWKFVESMENRGL-LHLKWPPTERELIEIKSKFGKLQSLPNCCGVVDVT 169
FG+ + ++ + +++++ + +L+WP E L I+ +F + +PN G + T
Sbjct: 234 FGLGISTCHKLVLEVCKAIKDVLMPKYLQWPDDE-SLRNIRERFESVSGIPNVVGSMYTT 292
Query: 170 HI-----NMCLASTEPNKDVWLDHENKHSMVLQAIVDPDMRFRDIVTGWPGQMKDWMVFE 224
HI + +AS + + + +S+ +QA+V+P F D+ GWPG M D V E
Sbjct: 293 HIPIIAPKISVASYFNKRHTERNQKTSYSITIQAVVNPKGVFTDLCIGWPGSMPDDKVLE 352
Query: 225 DSTFHKLCDEGERLNGNIIRLPDGSEIREYIIGDSGYPLLPYLIVPYKGEEQELSVSQAN 284
S ++ + G L G ++ G G+PLL +++VPY +Q L+ +Q
Sbjct: 353 KSLLYQRANNGGLLKGM------------WVAGGPGHPLLDWVLVPYT--QQNLTWTQHA 398
Query: 285 FNRRHLETRMVAKRALVMLKEMWGIIQGTMWRPNKHRLSKVILVCCILHNI 335
FN + E + VAK A LK W +Q L V+ CC+LHNI
Sbjct: 399 FNEKMSEVQGVAKEAFGRLKGRWACLQKRT-EVKLQDLPTVLGACCVLHNI 448
>AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT5G12010.1); Has 945 Blast hits to 944 proteins
in 87 species: Archae - 0; Bacteria - 0; Metazoa - 519;
Fungi - 43; Plants - 365; Viruses - 0; Other Eukaryotes
- 18 (source: NCBI BLink). | chr4:14579859-14581481
FORWARD LENGTH=540
Length = 540
Score = 103 bits (256), Expect = 3e-22, Method: Compositional matrix adjust.
Identities = 82/318 (25%), Positives = 145/318 (45%), Gaps = 28/318 (8%)
Query: 24 WGQEGPLDWSDEFSEKINGRHQNTPTSVFKITKSMFDYICSLVEEDMMEKPPHLAFTNGQ 83
W +E DW D S ++ F+++KS F+ IC ++ + +K L
Sbjct: 191 WVKERTTDWWDRVSRP--DFPEDEFRREFRMSKSTFNLICEELDTTVTKKNTMLRDAIPA 248
Query: 84 PVSLHDQVAVALRRLGSGDSLVLIGGIFGVSNTMVSQITWKFVESMENRGL-LHLKWPPT 142
P +V V + RL +G L + FG+ + ++ + ++ + + +L WP +
Sbjct: 249 P----KRVGVCVWRLATGAPLRHVSERFGLGISTCHKLVIEVCRAIYDVLMPKYLLWP-S 303
Query: 143 ERELIEIKSKFGKLQSLPNCCGVVDVTHINMC-----LASTEPNKDVWLDHENKHSMVLQ 197
+ E+ K+KF + +PN G + THI + +A+ + + + +S+ +Q
Sbjct: 304 DSEINSTKAKFESVHKIPNVVGSIYTTHIPIIAPKVHVAAYFNKRHTERNQKTSYSITVQ 363
Query: 198 AIVDPDMRFRDIVTGWPGQMKDWMVFEDSTFHKLCDEGERLNGNIIRLPDGSEIREYIIG 257
+V+ D F D+ G PG + D + E S+ + +R ++R +I+G
Sbjct: 364 GVVNADGIFTDVCIGNPGSLTDDQILEKSSLSR-----QRAARGMLR-------DSWIVG 411
Query: 258 DSGYPLLPYLIVPYKGEEQELSVSQANFNRRHLETRMVAKRALVMLKEMWGIIQGTMWRP 317
+SG+PL YL+VPY Q L+ +Q FN E + +A A LK W +Q
Sbjct: 412 NSGFPLTDYLLVPYT--RQNLTWTQHAFNESIGEIQGIATAAFERLKGRWACLQKRT-EV 468
Query: 318 NKHRLSKVILVCCILHNI 335
L V+ CC+LHNI
Sbjct: 469 KLQDLPYVLGACCVLHNI 486
>AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:6609678-6611018 REVERSE LENGTH=446
Length = 446
Score = 97.4 bits (241), Expect = 1e-20, Method: Compositional matrix adjust.
Identities = 74/261 (28%), Positives = 120/261 (45%), Gaps = 9/261 (3%)
Query: 79 FTNGQPVSLHDQVAVA--LRRLGSGDSLVLIGGIFGVSNTMVSQITWKFVESMENRGLL- 135
F +SL AVA L RL G S + + + ++S+IT V + L
Sbjct: 138 FITASNLSLPADYAVAMVLSRLAHGCSAKTLASRYSLDPYLISKIT-NMVTRLLATKLYP 196
Query: 136 -HLKWPPTERELIEIKSKFGKLQSLPNCCGVVDVTHINMCLASTEPNKDVWLDHENKHSM 194
+K P +R LIE F +L SLPN CG +D T + + + ++++ ++
Sbjct: 197 EFIKIPVGKRRLIETTQGFEELTSLPNICGAIDSTPVKLRRRTKLNPRNIYGCKYGYDAV 256
Query: 195 VLQAIVDPDMRFRDIVTGWPGQMKDWMVFEDSTFHKLCDEGERLNGNIIRLPDGSEIREY 254
+LQ + D F D+ PG D F DS +K G+ + +I + G +R Y
Sbjct: 257 LLQVVADHKKIFWDVCVKAPGGEDDSSHFRDSLLYKRLTSGDIVWEKVINI-RGHHVRPY 315
Query: 255 IIGDSGYPLLPYLIVPYKGEEQELSVSQANFNRRHLETRMVAKRALVMLKEMWGIIQGTM 314
I+GD YPLL +L+ P+ + + F+ ++ R V A+ +LK W I+Q
Sbjct: 316 IVGDWCYPLLSFLMTPFSPNGSG-TPPENLFDGMLMKGRSVVVEAIGLLKARWKILQSLN 374
Query: 315 WRPNKHRLSKVILVCCILHNI 335
N + I+ CC+LHN+
Sbjct: 375 VGVNHA--PQTIVACCVLHNL 393
>AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED IN:
shoot apex, embryo, flower, seed; EXPRESSED DURING:
petal differentiation and expansion stage, E expanded
cotyledon stage, D bilateral stage; BEST Arabidopsis
thaliana protein match is: PIF / Ping-Pong family of
plant transposases (TAIR:AT3G55350.1). |
chr1:27209890-27211122 REVERSE LENGTH=410
Length = 410
Score = 88.6 bits (218), Expect = 8e-18, Method: Compositional matrix adjust.
Identities = 74/253 (29%), Positives = 112/253 (44%), Gaps = 24/253 (9%)
Query: 90 QVAVALRRLGSGDSLVLIGGIFGVSNTMVSQITWKFVESMENRGLLHLKWPPTERELIEI 149
A + RL G S + FG +T + ++ V + N L ++L +
Sbjct: 122 SFAATIFRLAHGASYECLVHRFGFDSTSQASRSFFTVCKLINEKL--------SQQLDDP 173
Query: 150 KSKFGKLQSLPNCCGVVDVTHINMCLASTEPNKDVWLDHENKHSMVLQAIVDPDMRFRDI 209
K F LPNC GVV + K L K S+++QA+VD + RF DI
Sbjct: 174 KPDFSP-NLLPNCYGVVGFGRFEV--------KGKLLGA--KGSILVQALVDSNGRFVDI 222
Query: 210 VTGWPGQMKDWMVFEDSTFHKLCDEGERLNGNIIRLPDGSEIREYIIGDSGYPLLPYLIV 269
GWP MK +F + + + E L+G +L +G + YI+GDS PLLP+L+
Sbjct: 223 SAGWPSTMKPEAIFRQTKLFSIAE--EVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVT 280
Query: 270 PYKGEEQELSVSQANFNRRHLETRMVAKRALVMLKEMWGIIQGTMWRPNK-HRLSKVILV 328
PY E S + N H V + A ++ W I+ W+P + VI
Sbjct: 281 PYDLTSDEESFREEFNNVVHTGLHSV-EIAFAKVRARWRILD-KKWKPETIEFMPFVITT 338
Query: 329 CCILHNIVIDMGD 341
C+LHN +++ GD
Sbjct: 339 GCLLHNFLVNSGD 351
>AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 60S
biogenesis N-terminal (InterPro:IPR021714); BEST
Arabidopsis thaliana protein match is: unknown protein
(TAIR:AT4G27010.1); Has 772 Blast hits to 657 proteins
in 120 species: Archae - 0; Bacteria - 0; Metazoa - 344;
Fungi - 94; Plants - 322; Viruses - 0; Other Eukaryotes
- 12 (source: NCBI BLink). | chr1:27199733-27211122
REVERSE LENGTH=2845
Length = 2845
Score = 86.3 bits (212), Expect = 3e-17, Method: Compositional matrix adjust.
Identities = 74/251 (29%), Positives = 112/251 (44%), Gaps = 24/251 (9%)
Query: 92 AVALRRLGSGDSLVLIGGIFGVSNTMVSQITWKFVESMENRGLLHLKWPPTERELIEIKS 151
A + RL G S + FG +T + ++ V + N L ++L + K
Sbjct: 124 AATIFRLAHGASYECLVHRFGFDSTSQASRSFFTVCKLINEKL--------SQQLDDPKP 175
Query: 152 KFGKLQSLPNCCGVVDVTHINMCLASTEPNKDVWLDHENKHSMVLQAIVDPDMRFRDIVT 211
F LPNC GVV + K L K S+++QA+VD + RF DI
Sbjct: 176 DFSP-NLLPNCYGVVGFGRFEV--------KGKLLG--AKGSILVQALVDSNGRFVDISA 224
Query: 212 GWPGQMKDWMVFEDSTFHKLCDEGERLNGNIIRLPDGSEIREYIIGDSGYPLLPYLIVPY 271
GWP MK +F + + + E L+G +L +G + YI+GDS PLLP+L+ PY
Sbjct: 225 GWPSTMKPEAIFRQTKLFSIAE--EVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPY 282
Query: 272 KGEEQELSVSQANFNRRHLETRMVAKRALVMLKEMWGIIQGTMWRPNK-HRLSKVILVCC 330
E S + N H V + A ++ W I+ W+P + VI C
Sbjct: 283 DLTSDEESFREEFNNVVHTGLHSV-EIAFAKVRARWRILDKK-WKPETIEFMPFVITTGC 340
Query: 331 ILHNIVIDMGD 341
+LHN +++ GD
Sbjct: 341 LLHNFLVNSGD 351
>AT5G41980.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative
harbinger transposase-derived nuclease
(InterPro:IPR006912); BEST Arabidopsis thaliana protein
match is: unknown protein (TAIR:AT1G43722.1); Has 1807
Blast hits to 1807 proteins in 277 species: Archae - 0;
Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
Viruses - 0; Other Eukaryotes - 339 (source: NCBI
BLink). | chr5:16793765-16794889 FORWARD LENGTH=374
Length = 374
Score = 61.2 bits (147), Expect = 1e-09, Method: Compositional matrix adjust.
Identities = 48/182 (26%), Positives = 76/182 (41%), Gaps = 24/182 (13%)
Query: 159 LPNCCGVVDVTHINMCLASTEPNKDVWLDHENKHSMVLQAIVDP---DMRFRDIVTGWPG 215
+C GVVD HI + + E N + ++ Q ++ D+RF ++ GW G
Sbjct: 140 FKDCVGVVDSFHIPVMVGVDEQGP-----FRNGNGLLTQNVLAASSFDLRFNYVLAGWEG 194
Query: 216 QMKDWMVFEDSTFHKLCDEGERLNGNIIRLPDGSEIREYIIGDSGYPLLPYLIVPYKG-E 274
D V + + N +++P G +Y I D+ YP LP I PY G
Sbjct: 195 SASDQQVLNAALTRR----------NKLQVPQG----KYYIVDNKYPNLPGFIAPYHGVS 240
Query: 275 EQELSVSQANFNRRHLETRMVAKRALVMLKEMWGIIQGTMWRPNKHRLSKVILVCCILHN 334
++ FN RH R LKE + I+ P + ++ K+++ C LHN
Sbjct: 241 TNSREEAKEMFNERHKLLHRAIHRTFGALKERFPILLSAPPYPLQTQV-KLVIAACALHN 299
Query: 335 IV 336
V
Sbjct: 300 YV 301
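The per-alignment statistics in this report can be recomputed from the raw fractions. The sketch below is illustrative only: `FRACTION` and `alignment_stats` are hypothetical names, and the input is two pairs of statistics lines copied from the first and last alignments. It assumes every statistics line uses the `Name = numerator/denominator (percent%)` layout seen in this report.

```python
import re

# Statistics lines copied from the first and last alignments in this report.
REPORT = """\
Score = 376 bits (966), Expect = e-104, Method: Compositional matrix adjust.
Identities = 183/355 (51%), Positives = 246/355 (69%), Gaps = 9/355 (2%)
Score = 61.2 bits (147), Expect = 1e-09, Method: Compositional matrix adjust.
Identities = 48/182 (26%), Positives = 76/182 (41%), Gaps = 24/182 (13%)
"""

# Hypothetical pattern capturing the name and the numerator/denominator pair.
FRACTION = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")


def alignment_stats(report):
    """Return one dict of exact percentages per 'Identities' line found."""
    stats = []
    for line in report.splitlines():
        found = FRACTION.findall(line)
        if found:
            stats.append(
                {name: 100.0 * int(num) / int(den) for name, num, den in found}
            )
    return stats


stats = alignment_stats(REPORT)
for entry in stats:
    print({name: round(value, 2) for name, value in entry.items()})
```

In these two alignments the percentages printed in the report match the exact values truncated to whole numbers rather than rounded. For example, 76/182 is about 41.76% but is reported as 41%.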