Miyakogusa Predicted Gene
- Lj5g3v0998840.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj5g3v0998840.1 Non Characterized Hit- tr|I1IJ83|I1IJ83_BRADI
Uncharacterized protein OS=Brachypodium distachyon
GN=,29.94,0.00000000000004,UNCHARACTERIZED,Harbinger
transposase-derived nuclease; coiled-coil,NULL; seg,NULL;
DDE_4,NULL,NODE_70855_length_1398_cov_16.856939.path2.1
(394 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                     Score  E
Sequences producing significant alignments:                         (bits)  Value
AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative h...     535  e-152
AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant transp...     343  2e-94
AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response...     155  3e-38
AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis thal...     129  3e-30
AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant transp...     103  2e-22
AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED I...      89  4e-18
AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 6...      88  1e-17
>AT3G63270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Putative
harbinger transposase-derived nuclease
(InterPro:IPR006912); BEST Arabidopsis thaliana protein
match is: PIF / Ping-Pong family of plant transposases
(TAIR:AT3G55350.1); Has 30201 Blast hits to 17322
proteins in 780 species: Archae - 12; Bacteria - 1396;
Metazoa - 17338; Fungi - 3422; Plants - 5037; Viruses -
0; Other Eukaryotes - 2996 (source: NCBI BLink). |
chr3:23375932-23377398 REVERSE LENGTH=396
Length = 396
Score = 535 bits (1377), Expect = e-152, Method: Compositional matrix adjust.
Identities = 253/368 (68%), Positives = 296/368 (80%), Gaps = 4/368 (1%)
Query: 24 VNPVSVEPRTSETDWWESFWHKNSTAPGYSVSGDEEEGFKYFFRVSKTTFEYICSLVRQD 83
VN V ++P + DWW++FW +NS+ SV DE+ FK+FFR SKTTF YICSLVR+D
Sbjct: 30 VNAVPLDPEAIDCDWWDTFWLRNSSP---SVPSDEDYAFKHFFRASKTTFSYICSLVRED 86
Query: 84 LISRPPSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEA 143
LISRPPSGLINIEGRLLSVEKQVAIALRRLASG+SQVSVGA+FGVGQSTVSQVTWRFIEA
Sbjct: 87 LISRPPSGLINIEGRLLSVEKQVAIALRRLASGDSQVSVGAAFGVGQSTVSQVTWRFIEA 146
Query: 144 LEERATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETSYDWCDQEK 203
LEERA HHL WPD +R++EIK FE YGLPNCCGA+D THI+MTLPAV+ S DWCDQEK
Sbjct: 147 LEERAKHHLRWPDSDRIEEIKSKFEEMYGLPNCCGAIDTTHIIMTLPAVQASDDWCDQEK 206
Query: 204 NYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLG-GDV 262
NYSM QG+ DHEMRF++++TG PGGMT S+ LK SGF++L +N + L+GN +TL G
Sbjct: 207 NYSMFLQGVFDHEMRFLNMVTGWPGGMTVSKLLKFSGFFKLCENAQILDGNPKTLSQGAQ 266
Query: 263 IREYVVGGYSYPLLPWLMTPYETNGISDSQSTFNYKHGAARLLAVRAFSLLKGSWRILSK 322
IREYVVGG SYPLLPWL+TP++++ SDS FN +H R +A AF LKGSWRILSK
Sbjct: 267 IREYVVGGISYPLLPWLITPHDSDHPSDSMVAFNERHEKVRSVAATAFQQLKGSWRILSK 326
Query: 323 VMWRPDKRKLPSIILTCCLLHNIVIDCGDTLHPDVALSAHHDSGYQEQYCKQVDPSGRTM 382
VMWRPD+RKLPSIIL CCLLHNI+IDCGD L DV LS HHDSGY ++YCKQ +P G +
Sbjct: 327 VMWRPDRRKLPSIILVCCLLHNIIIDCGDYLQEDVPLSGHHDSGYADRYCKQTEPLGSEL 386
Query: 383 RENLARHL 390
R L HL
Sbjct: 387 RGCLTEHL 394
>AT3G55350.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:20518518-20520690 FORWARD LENGTH=406
Length = 406
Score = 343 bits (879), Expect = 2e-94, Method: Compositional matrix adjust.
Identities = 170/359 (47%), Positives = 228/359 (63%), Gaps = 17/359 (4%)
Query: 37 DWWESFWHK---NSTAPGYSVSGDEEEGFKYFFRVSKTTFEYICSLVRQDLISRPPSGLI 93
DWW+ F + ST P + F+ F++S+ TF+YICSLV+ D ++P +
Sbjct: 53 DWWDGFSRRIYGGSTDP---------KTFESVFKISRKTFDYICSLVKADFTAKP-ANFS 102
Query: 94 NIEGRLLSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEALEERATHHLN 153
+ G LS+ +VA+ALRRL SGES +G +FG+ QSTVSQ+TWRF+E++EERA HHL+
Sbjct: 103 DSNGNPLSLNDRVAVALRRLGSGESLSVIGETFGMNQSTVSQITWRFVESMEERAIHHLS 162
Query: 154 WPDCNRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETSYD-WCDQEKNYSMLFQGI 212
WP +++ EIK FE GLPNCCGA+D THI+M LPAVE S W D EKN+SM Q +
Sbjct: 163 WP--SKLDEIKSKFEKISGLPNCCGAIDITHIVMNLPAVEPSNKVWLDGEKNFSMTLQAV 220
Query: 213 VDHEMRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLGGDV-IREYVVGGY 271
VD +MRF+D++ G PG + LK SGFY+L + G+RLNG L +REY+VG
Sbjct: 221 VDPDMRFLDVIAGWPGSLNDDVVLKNSGFYKLVEKGKRLNGEKLPLSERTELREYIVGDS 280
Query: 272 SYPLLPWLMTPYETNGISDSQSTFNYKHGAARLLAVRAFSLLKGSWRILSKVMWRPDKRK 331
+PLLPWL+TPY+ S Q+ FN +H A A A S LK WRI++ VMW PD+ +
Sbjct: 281 GFPLLPWLLTPYQGKPTSLPQTEFNKRHSEATKAAQMALSKLKDRWRIINGVMWMPDRNR 340
Query: 332 LPSIILTCCLLHNIVIDCGDTLHPDVALSAHHDSGYQEQYCKQVDPSGRTMRENLARHL 390
LP II CCLLHNI+ID D D LS HD Y+++ CK D + +R+ L+ L
Sbjct: 341 LPRIIFVCCLLHNIIIDMEDQTLDDQPLSQQHDMNYRQRSCKLADEASSVLRDELSDQL 399
>AT5G12010.1 | Symbols: | unknown protein; INVOLVED IN: response to
salt stress; LOCATED IN: chloroplast, plasma membrane,
membrane; EXPRESSED IN: 23 plant structures; EXPRESSED
DURING: 13 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT4G29780.1);
Has 1807 Blast hits to 1807 proteins in 277 species:
Archae - 0; Bacteria - 0; Metazoa - 736; Fungi - 347;
Plants - 385; Viruses - 0; Other Eukaryotes - 339
(source: NCBI BLink). | chr5:3877975-3879483 REVERSE
LENGTH=502
Length = 502
Score = 155 bits (393), Expect = 3e-38, Method: Compositional matrix adjust.
Identities = 100/341 (29%), Positives = 169/341 (49%), Gaps = 26/341 (7%)
Query: 59 EEGFKYFFRVSKTTFEYICSLVRQDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGES 118
EE FK FR+SK+TFE IC + +++ + L N + V ++VA+ + RLA+GE
Sbjct: 172 EEDFKKAFRMSKSTFELICDEL-NSAVAKEDTALRNA----IPVRQRVAVCIWRLATGEP 226
Query: 119 QVSVGASFGVGQSTVSQVTWRFIEALEE-RATHHLNWPDCNRMQEIKFGFEASYGLPNCC 177
V FG+G ST ++ +A+++ +L WPD ++ I+ FE+ G+PN
Sbjct: 227 LRLVSKKFGLGISTCHKLVLEVCKAIKDVLMPKYLQWPDDESLRNIRERFESVSGIPNVV 286
Query: 178 GALDATHIMMTLPAVETS------YDWCDQEKNYSMLFQGIVDHEMRFIDIMTGLPGGMT 231
G++ THI + P + + + +Q+ +YS+ Q +V+ + F D+ G PG M
Sbjct: 287 GSMYTTHIPIIAPKISVASYFNKRHTERNQKTSYSITIQAVVNPKGVFTDLCIGWPGSMP 346
Query: 232 FSRFLKCSGFYRLSQNGERLNGNVRTLGGDVIREYVVGGYSYPLLPWLMTPYETNGISDS 291
+ L+ S Y+ + NG L G +V GG +PLL W++ PY ++ +
Sbjct: 347 DDKVLEKSLLYQRANNGGLLKGM-----------WVAGGPGHPLLDWVLVPYTQQNLTWT 395
Query: 292 QSTFNYKHGAARLLAVRAFSLLKGSWRILSKVMWRPDKRKLPSIILTCCLLHNIVIDCGD 351
Q FN K + +A AF LKG W L K + LP+++ CC+LHNI +
Sbjct: 396 QHAFNEKMSEVQGVAKEAFGRLKGRWACLQKRT-EVKLQDLPTVLGACCVLHNICEMREE 454
Query: 352 TLHPDVALSAHHDSGYQEQYCKQVDPSGRTMRENLARHLRH 392
+ P++ + D E + V+ R+ ++ +L H
Sbjct: 455 KMEPELMVEVIDDEVLPENVLRSVN--AMKARDTISHNLLH 493
>AT4G29780.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT5G12010.1); Has 945 Blast hits to 944 proteins
in 87 species: Archae - 0; Bacteria - 0; Metazoa - 519;
Fungi - 43; Plants - 365; Viruses - 0; Other Eukaryotes
- 18 (source: NCBI BLink). | chr4:14579859-14581481
FORWARD LENGTH=540
Length = 540
Score = 129 bits (325), Expect = 3e-30, Method: Compositional matrix adjust.
Identities = 98/371 (26%), Positives = 169/371 (45%), Gaps = 38/371 (10%)
Query: 29 VEPRTSETDWWESFWHKNSTAPGYSVSGDEEEGFKYFFRVSKTTFEYICSLVRQDLISRP 88
V+ RT TDWW+ + P + E+ F+ FR+SK+TF IC + +++
Sbjct: 192 VKERT--TDWWDRV-----SRPDFP-----EDEFRREFRMSKSTFNLICEEL-DTTVTKK 238
Query: 89 PSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEALEE-R 147
+ L + + K+V + + RLA+G V FG+G ST ++ A+ +
Sbjct: 239 NTMLRDA----IPAPKRVGVCVWRLATGAPLRHVSERFGLGISTCHKLVIEVCRAIYDVL 294
Query: 148 ATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETS------YDWCDQ 201
+L WP + + K FE+ + +PN G++ THI + P V + + +Q
Sbjct: 295 MPKYLLWPSDSEINSTKAKFESVHKIPNVVGSIYTTHIPIIAPKVHVAAYFNKRHTERNQ 354
Query: 202 EKNYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLGGD 261
+ +YS+ QG+V+ + F D+ G PG +T + L+ S R R G
Sbjct: 355 KTSYSITVQGVVNADGIFTDVCIGNPGSLTDDQILEKSSLSRQ-----------RAARGM 403
Query: 262 VIREYVVGGYSYPLLPWLMTPYETNGISDSQSTFNYKHGAARLLAVRAFSLLKGSWRILS 321
+ ++VG +PL +L+ PY ++ +Q FN G + +A AF LKG W L
Sbjct: 404 LRDSWIVGNSGFPLTDYLLVPYTRQNLTWTQHAFNESIGEIQGIATAAFERLKGRWACLQ 463
Query: 322 KVMWRPDKRKLPSIILTCCLLHNIVIDCGDTLHPDVALSAHHDSGYQEQYCKQVDPSGRT 381
K + LP ++ CC+LHNI + + P++ D E + S
Sbjct: 464 KRT-EVKLQDLPYVLGACCVLHNICEMRKEEMLPELKFEVFDDVAVPENNIRSA--SAVN 520
Query: 382 MRENLARHLRH 392
R++++ +L H
Sbjct: 521 TRDHISHNLLH 531
>AT3G19120.1 | Symbols: | PIF / Ping-Pong family of plant
transposases | chr3:6609678-6611018 REVERSE LENGTH=446
Length = 446
Score = 103 bits (258), Expect = 2e-22, Method: Compositional matrix adjust.
Identities = 73/250 (29%), Positives = 118/250 (47%), Gaps = 6/250 (2%)
Query: 100 LSVEKQVAIALRRLASGESQVSVGASFGVGQSTVSQVTWRFIEALEERA-THHLNWP-DC 157
L + VA+ L RLA G S ++ + + + +S++T L + + P
Sbjct: 146 LPADYAVAMVLSRLAHGCSAKTLASRYSLDPYLISKITNMVTRLLATKLYPEFIKIPVGK 205
Query: 158 NRMQEIKFGFEASYGLPNCCGALDATHIMMTLPAVETSYDWCDQEKNY-SMLFQGIVDHE 216
R+ E GFE LPN CGA+D+T + + + + Y ++L Q + DH+
Sbjct: 206 RRLIETTQGFEELTSLPNICGAIDSTPVKLRRRTKLNPRNIYGCKYGYDAVLLQVVADHK 265
Query: 217 MRFIDIMTGLPGGMTFSRFLKCSGFYRLSQNGERLNGNVRTLGGDVIREYVVGGYSYPLL 276
F D+ PGG S + S Y+ +G+ + V + G +R Y+VG + YPLL
Sbjct: 266 KIFWDVCVKAPGGEDDSSHFRDSLLYKRLTSGDIVWEKVINIRGHHVRPYIVGDWCYPLL 325
Query: 277 PWLMTPYETNGI-SDSQSTFNYKHGAARLLAVRAFSLLKGSWRILSKVMWRPDKRKLPSI 335
+LMTP+ NG + ++ F+ R + V A LLK W+IL + P
Sbjct: 326 SFLMTPFSPNGSGTPPENLFDGMLMKGRSVVVEAIGLLKARWKILQSL--NVGVNHAPQT 383
Query: 336 ILTCCLLHNI 345
I+ CC+LHN+
Sbjct: 384 IVACCVLHNL 393
>AT1G72270.2 | Symbols: | LOCATED IN: mitochondrion; EXPRESSED IN:
shoot apex, embryo, flower, seed; EXPRESSED DURING:
petal differentiation and expansion stage, E expanded
cotyledon stage, D bilateral stage; BEST Arabidopsis
thaliana protein match is: PIF / Ping-Pong family of
plant transposases (TAIR:AT3G55350.1). |
chr1:27209890-27211122 REVERSE LENGTH=410
Length = 410
Score = 89.4 bits (220), Expect = 4e-18, Method: Compositional matrix adjust.
Identities = 79/292 (27%), Positives = 130/292 (44%), Gaps = 45/292 (15%)
Query: 65 FFRVSKTTFEYICSLVRQDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGA 124
+FR+SK+TF + S++ S PS A + RLA G S +
Sbjct: 100 YFRMSKSTFFSLYSILSH---SSLPS---------------FAATIFRLAHGASYECLVH 141
Query: 125 SFGV-GQSTVSQVTWRFIEALEERATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDAT 183
FG S S+ + + + E+ + L+ P + + LPNC G +
Sbjct: 142 RFGFDSTSQASRSFFTVCKLINEKLSQQLDDPKPDFSPNL---------LPNCYGVVGFG 192
Query: 184 HIMMTLPAVETSYDWCDQEKNYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYR 243
+ + S+L Q +VD RF+DI G P M + + +
Sbjct: 193 RFEVKGKLLGA---------KGSILVQALVDSNGRFVDISAGWPSTMKPEAIFRQTKLFS 243
Query: 244 LSQNGERLNGNVRTLG-GDVIREYVVGGYSYPLLPWLMTPYETNGISDS--QSTFNYKHG 300
+++ E L+G LG G ++ Y++G PLLPWL+TPY+ +S + N H
Sbjct: 244 IAE--EVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFREEFNNVVHT 301
Query: 301 AARLLAVRAFSLLKGSWRILSKVMWRPDKRK-LPSIILTCCLLHNIVIDCGD 351
+ + AF+ ++ WRIL K W+P+ + +P +I T CLLHN +++ GD
Sbjct: 302 GLHSVEI-AFAKVRARWRILDK-KWKPETIEFMPFVITTGCLLHNFLVNSGD 351
>AT1G72270.1 | Symbols: | CONTAINS InterPro DOMAIN/s: Ribosome 60S
biogenesis N-terminal (InterPro:IPR021714); BEST
Arabidopsis thaliana protein match is: unknown protein
(TAIR:AT4G27010.1); Has 772 Blast hits to 657 proteins
in 120 species: Archae - 0; Bacteria - 0; Metazoa - 344;
Fungi - 94; Plants - 322; Viruses - 0; Other Eukaryotes
- 12 (source: NCBI BLink). | chr1:27199733-27211122
REVERSE LENGTH=2845
Length = 2845
Score = 87.8 bits (216), Expect = 1e-17, Method: Compositional matrix adjust.
Identities = 81/292 (27%), Positives = 131/292 (44%), Gaps = 45/292 (15%)
Query: 65 FFRVSKTTFEYICSLVRQDLISRPPSGLINIEGRLLSVEKQVAIALRRLASGESQVSVGA 124
+FR+SK+TF + S++ S PS A + RLA G S +
Sbjct: 100 YFRMSKSTFFSLYSILSH---SSLPS---------------FAATIFRLAHGASYECLVH 141
Query: 125 SFGV-GQSTVSQVTWRFIEALEERATHHLNWPDCNRMQEIKFGFEASYGLPNCCGALDAT 183
FG S S+ + + + E+ + L+ P K F + LPNC G +
Sbjct: 142 RFGFDSTSQASRSFFTVCKLINEKLSQQLDDP--------KPDFSPNL-LPNCYGVVGFG 192
Query: 184 HIMMTLPAVETSYDWCDQEKNYSMLFQGIVDHEMRFIDIMTGLPGGMTFSRFLKCSGFYR 243
+ + S+L Q +VD RF+DI G P M + + +
Sbjct: 193 RFEVKGKLLGA---------KGSILVQALVDSNGRFVDISAGWPSTMKPEAIFRQTKLFS 243
Query: 244 LSQNGERLNGNVRTLG-GDVIREYVVGGYSYPLLPWLMTPYETNGISDS--QSTFNYKHG 300
+++ E L+G LG G ++ Y++G PLLPWL+TPY+ +S + N H
Sbjct: 244 IAE--EVLSGAPTKLGNGVLVPRYILGDSCLPLLPWLVTPYDLTSDEESFREEFNNVVHT 301
Query: 301 AARLLAVRAFSLLKGSWRILSKVMWRPDKRK-LPSIILTCCLLHNIVIDCGD 351
+ + AF+ ++ WRIL K W+P+ + +P +I T CLLHN +++ GD
Sbjct: 302 GLHSVEI-AFAKVRARWRILDK-KWKPETIEFMPFVITTGCLLHNFLVNSGD 351