Miyakogusa Predicted Gene - Lj0g3v0219279.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj0g3v0219279.1 tr|B6UDA2|B6UDA2_MAIZE Harpin-induced protein
OS=Zea mays PE=2 SV=1,27.23,6e-19,seg,NULL; LEA_2,Late embryogenesis
abundant protein, LEA-14,NODE_45710_length_1044_cov_36.521072.path1.1
(237 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
                                                                       Score  E
Sequences producing significant alignments:                           (bits)  Value
AT4G26490.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...      220  8e-58
AT5G56050.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...      205  2e-53
AT3G26350.1 | Symbols: | LOCATED IN: chloroplast; EXPRESSED IN:...      118  3e-27
AT1G13050.1 | Symbols: | unknown protein; BEST Arabidopsis thal...       114  6e-26
AT1G13050.2 | Symbols: | unknown protein; FUNCTIONS IN: molecul...       109  2e-24
AT5G22870.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...       59  4e-09
AT3G11650.1 | Symbols: NHL2 | NDR1/HIN1-like 2 | chr3:3676264-36...       58  5e-09
AT5G06320.1 | Symbols: NHL3 | NDR1/HIN1-like 3 | chr5:1931016-19...       57  1e-08
AT4G05220.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...       56  2e-08
AT1G08160.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...       52  2e-07
AT2G35460.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...       52  4e-07
AT2G35980.1 | Symbols: YLS9, NHL10, ATNHL10 | Late embryogenesis...       49  2e-06
>AT4G26490.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr4:13380425-13381231 FORWARD LENGTH=268
Length = 268
Score = 220 bits (560), Expect = 8e-58, Method: Compositional matrix adjust.
Identities = 97/193 (50%), Positives = 144/193 (74%)
Query: 43 LWQPRHQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYF 102
L QPR +T IW A C +FSL+LIFF IATLI++L+ +PR P+FDIPNA+L+ +YF
Sbjct: 74 LRQPRSSRTSLWIWCVAGFCFVFSLLLIFFAIATLIVFLAIRPRIPVFDIPNANLHTIYF 133
Query: 103 DSTSYLNGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQS 162
D+ + NG+ + L NFTNPN +I V+FE L+IELFF +RLI++Q ++PF Q+ ETRL+
Sbjct: 134 DTPEFFNGDLSMLVNFTNPNKKIEVKFEKLRIELFFFNRLIAAQVVQPFLQKKHETRLEP 193
Query: 163 VKLMSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHIHLSHWLHSVCHIEMNG 222
++L+S+L+ LP + V+L+ Q+++N++ Y RGT+KV+ + G IH S+ LH C ++M G
Sbjct: 194 IRLISSLVGLPVNHAVELRRQLENNKIEYEIRGTFKVKAHFGMIHYSYQLHGRCQLQMTG 253
Query: 223 PPNGVLLARECTT 235
PP G+L++R CTT
Sbjct: 254 PPTGILISRNCTT 266
>AT5G56050.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast; BEST Arabidopsis thaliana protein match is:
Late embryogenesis abundant (LEA) hydroxyproline-rich
glycoprotein family (TAIR:AT4G26490.1); Has 1807 Blast
hits to 1807 proteins in 277 species: Archae - 0;
Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
Viruses - 0; Other Eukaryotes - 339 (source: NCBI
BLink). | chr5:22701167-22702018 REVERSE LENGTH=283
Length = 283
Score = 205 bits (522), Expect = 2e-53, Method: Compositional matrix adjust.
Identities = 96/196 (48%), Positives = 137/196 (69%)
Query: 42 ALWQPRHQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVY 101
L Q R +T P IW A LC IFS++LI FGIATLI+YL+ KPR P+FDI NA LN +
Sbjct: 87 VLLQLRTSRTNPWIWCGAALCFIFSILLIVFGIATLILYLAVKPRTPVFDISNAKLNTIL 146
Query: 102 FDSTSYLNGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQ 161
F+S Y NG+ NFTNPN +++VRFE+L +EL+F+D I++Q + PF+QR +TRL+
Sbjct: 147 FESPVYFNGDMLLQLNFTNPNKKLNVRFENLMVELWFADTKIATQGVLPFSQRNGKTRLE 206
Query: 162 SVKLMSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHIHLSHWLHSVCHIEMN 221
++L+SNL+FLP + ++L+ QV SNR+ Y R ++V+ G IH S+ LH +C ++++
Sbjct: 207 PIRLISNLVFLPVNHILELRRQVTSNRIAYEIRSNFRVKAIFGMIHYSYMLHGICQLQLS 266
Query: 222 GPPNGVLLARECTTWR 237
PP G L+ R CTT R
Sbjct: 267 SPPAGGLVYRNCTTKR 282
>AT3G26350.1 | Symbols: | LOCATED IN: chloroplast; EXPRESSED IN:
root, pedicel, carpel, stamen; EXPRESSED DURING: 4
anthesis, petal differentiation and expansion stage;
CONTAINS InterPro DOMAIN/s: Late embryogenesis abundant
protein, group 2 (InterPro:IPR004864); BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G13050.1); Has 3534 Blast hits to 2704 proteins
in 342 species: Archae - 6; Bacteria - 192; Metazoa -
1076; Fungi - 505; Plants - 1162; Viruses - 224; Other
Eukaryotes - 369 (source: NCBI BLink). |
chr3:9653660-9654730 REVERSE LENGTH=356
Length = 356
Score = 118 bits (295), Expect = 3e-27, Method: Compositional matrix adjust.
Identities = 70/193 (36%), Positives = 110/193 (56%), Gaps = 1/193 (0%)
Query: 46 PRHQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDST 105
P ++T + W AA C IF ++LI G+ LI+YL ++PR+P DI A+LN Y D
Sbjct: 164 PPSRETNAMTWSAAFCCAIFWVILILGGLIILIVYLVYRPRSPYVDISAANLNAAYLDMG 223
Query: 106 SYLNGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKL 165
LNG+ LAN TNP+ + V F + EL++ + LI++Q I+PF + + +V L
Sbjct: 224 FLLNGDLTILANVTNPSKKSSVEFSYVTFELYYYNTLIATQYIEPFKVPKKTSMFANVHL 283
Query: 166 MSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHI-HLSHWLHSVCHIEMNGPP 224
+S+ + L +LQ Q+++ V RG + R ++G + S+ LH+ C + +NGPP
Sbjct: 284 VSSQVQLQATQSRELQRQIETGPVLLNLRGMFHARSHIGPLFRYSYKLHTHCSVSLNGPP 343
Query: 225 NGVLLARECTTWR 237
G + AR C T R
Sbjct: 344 LGAMRARRCNTKR 356
>AT1G13050.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G26350.1); Has 538 Blast hits to 510 proteins
in 88 species: Archae - 0; Bacteria - 23; Metazoa - 81;
Fungi - 36; Plants - 361; Viruses - 8; Other Eukaryotes
- 29 (source: NCBI BLink). | chr1:4450568-4451521
FORWARD LENGTH=317
Length = 317
Score = 114 bits (285), Expect = 6e-26, Method: Compositional matrix adjust.
Identities = 70/190 (36%), Positives = 107/190 (56%), Gaps = 1/190 (0%)
Query: 49 QKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYL 108
++TKP+ A + C I +VLI G+ L++YL+ +PR+P FDI A+LN D L
Sbjct: 128 KRTKPMTLPATICCAILLIVLILSGLILLLVYLANRPRSPYFDISAATLNTANLDMGYVL 187
Query: 109 NGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMSN 168
NG+ A + NFTNP+ + V F + EL+F + LI+++ I+PF + S L+S+
Sbjct: 188 NGDLAVVVNFTNPSKKSSVDFSYVMFELYFYNTLIATEHIEPFIVPKGMSMFTSFHLVSS 247
Query: 169 LMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHI-HLSHWLHSVCHIEMNGPPNGV 227
+ + LQ Q+ + V RGT+ R N+G + S+WLH+ C I +N PP G
Sbjct: 248 QVQIQMIQSQDLQLQLGTGPVLLNLRGTFHARSNLGSLMRYSYWLHTQCSISLNTPPAGT 307
Query: 228 LLARECTTWR 237
+ AR C T R
Sbjct: 308 MRARRCNTKR 317
>AT1G13050.2 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: endomembrane
system; EXPRESSED IN: 14 plant structures; EXPRESSED
DURING: 9 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT3G26350.1);
Has 260 Blast hits to 259 proteins in 20 species: Archae
- 0; Bacteria - 0; Metazoa - 0; Fungi - 0; Plants - 260;
Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink).
| chr1:4450964-4451521 FORWARD LENGTH=185
Length = 185
Score = 109 bits (273), Expect = 2e-24, Method: Compositional matrix adjust.
Identities = 59/160 (36%), Positives = 89/160 (55%), Gaps = 1/160 (0%)
Query: 79 IYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANFTNPNSRIHVRFESLKIELFF 138
+YL+ +PR+P FDI A+LN D LNG+ A + NFTNP+ + V F + EL+F
Sbjct: 26 VYLANRPRSPYFDISAATLNTANLDMGYVLNGDLAVVVNFTNPSKKSSVDFSYVMFELYF 85
Query: 139 SDRLISSQSIKPFTQRPRETRLQSVKLMSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYK 198
+ LI+++ I+PF + S L+S+ + + LQ Q+ + V RGT+
Sbjct: 86 YNTLIATEHIEPFIVPKGMSMFTSFHLVSSQVQIQMIQSQDLQLQLGTGPVLLNLRGTFH 145
Query: 199 VRFNMGHI-HLSHWLHSVCHIEMNGPPNGVLLARECTTWR 237
R N+G + S+WLH+ C I +N PP G + AR C T R
Sbjct: 146 ARSNLGSLMRYSYWLHTQCSISLNTPPAGTMRARRCNTKR 185
>AT5G22870.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr5:7647056-7647679 REVERSE LENGTH=207
Length = 207
Score = 58.5 bits (140), Expect = 4e-09, Method: Compositional matrix adjust.
Identities = 38/155 (24%), Positives = 76/155 (49%), Gaps = 3/155 (1%)
Query: 59 AVLCLIFSLVLIFFGIAT---LIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFL 115
+++C IF ++L +A LI +L KP+ + + NAS+ + ++++ F F
Sbjct: 24 SLICYIFLVILTLIFMAAVGFLITWLETKPKKLRYTVENASVQNFNLTNDNHMSATFQFT 83
Query: 116 ANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMSNLMFLPQD 175
NPN RI V + S++I + F D+ ++ +++PF Q + L++ + + +
Sbjct: 84 IQSHNPNHRISVYYSSVEIFVKFKDQTLAFDTVEPFHQPRMNVKQIDETLIAENVAVSKS 143
Query: 176 VGVKLQGQVQSNRVNYYARGTYKVRFNMGHIHLSH 210
G L+ Q ++ + +VRF +G SH
Sbjct: 144 NGKDLRSQNSLGKIGFEVFVKARVRFKVGIWKSSH 178
>AT3G11650.1 | Symbols: NHL2 | NDR1/HIN1-like 2 |
chr3:3676264-3676986 REVERSE LENGTH=240
Length = 240
Score = 58.2 bits (139), Expect = 5e-09, Method: Compositional matrix adjust.
Identities = 40/148 (27%), Positives = 73/148 (49%), Gaps = 7/148 (4%)
Query: 59 AVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANF 118
+++C I V + G+A LI++L F+P F + +A+LN FD + L ++ NF
Sbjct: 53 SLICNILIAVAVILGVAALILWLIFRPNAVKFYVADANLNRFSFDPNNNL--HYSLDLNF 110
Query: 119 T--NPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMS-NLMFLPQD 175
T NPN R+ V ++ + ++ D+ S ++ F Q + T + K+ NL+ L
Sbjct: 111 TIRNPNQRVGVYYDEFSVSGYYGDQRFGSANVSSFYQGHKNTTVILTKIEGQNLVVLGDG 170
Query: 176 VGVKLQGQVQSN--RVNYYARGTYKVRF 201
L+ +S R+N R + + +F
Sbjct: 171 ARTDLKDDEKSGIYRINAKLRLSVRFKF 198
>AT5G06320.1 | Symbols: NHL3 | NDR1/HIN1-like 3 |
chr5:1931016-1931711 REVERSE LENGTH=231
Length = 231
Score = 57.0 bits (136), Expect = 1e-08, Method: Compositional matrix adjust.
Identities = 47/154 (30%), Positives = 74/154 (48%), Gaps = 9/154 (5%)
Query: 60 VLCLIFSLVL---IFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLA 116
+L +IF++++ + GIA LII+L F+P F + +A L D T+ L +
Sbjct: 44 ILSVIFNILITIAVLLGIAALIIWLIFRPNAIKFHVTDAKLTEFTLDPTNNL--RYNLDL 101
Query: 117 NFT--NPNSRIHVRFESLKIELFFSD-RLISSQSIKPFTQRPRETRLQSVKLMS-NLMFL 172
NFT NPN RI V ++ +++ ++ D R S +I F Q + T + KL+ L+ L
Sbjct: 102 NFTIRNPNRRIGVYYDEIEVRGYYGDQRFGMSNNISKFYQGHKNTTVVGTKLVGQQLVLL 161
Query: 173 PQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHI 206
L V S A+ K+RF G I
Sbjct: 162 DGGERKDLNEDVNSQIYRIDAKLRLKIRFKFGLI 195
>AT4G05220.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr4:2685104-2685784 REVERSE LENGTH=226
Length = 226
Score = 55.8 bits (133), Expect = 2e-08, Method: Compositional matrix adjust.
Identities = 41/173 (23%), Positives = 78/173 (45%), Gaps = 5/173 (2%)
Query: 61 LCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANFTN 120
+C +F LVL F G+ I++LS +P P F I + + + T N AF N
Sbjct: 46 ICAMFLLVLFFVGVIAFILWLSLRPHRPRFHIQDFVVQGLD-QPTGVENARIAFNVTILN 104
Query: 121 PNSRIHVRFESLKIELFFSDRLISSQS-IKPFTQRPRETRLQSVKLMSNLMFLPQDVGVK 179
PN + V F+S++ +++ D+ + + PF Q+P T + + L + + + +
Sbjct: 105 PNQHMGVYFDSMEGSIYYKDQRVGLIPLLNPFFQQPTNTTIVTGTLTGASLTVNSNRWTE 164
Query: 180 LQGQVQSNRVNYYARGTYKVRFNMGH-IHLSHWLHSVCHIEMNGPPNGVLLAR 231
V + +RF + I H +H+ C+I + +G++L +
Sbjct: 165 FSNDRAQGTVGFRLDIVSTIRFKLHRWISKHHRMHANCNIVVGR--DGLILPK 215
>AT1G08160.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr1:2559672-2560337 REVERSE LENGTH=221
Length = 221
Score = 52.4 bits (124), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 32/127 (25%), Positives = 66/127 (51%), Gaps = 5/127 (3%)
Query: 78 IIYLSFKPRNPIFDIPNASLNVVYF-DSTSYLNGEFAFLANFTNPNSRIHVRFESLKIEL 136
I YL+ +P+ I+ + AS+ ++ ++N +F+++ NP + VR+ S++I
Sbjct: 57 ITYLTLRPKRLIYTVEAASVQEFAIGNNDDHINAKFSYVIKSYNPEKHVSVRYHSMRIST 116
Query: 137 FFSDRLISSQSIKPFTQRPR-ETRLQSVKLMSNLM---FLPQDVGVKLQGQVQSNRVNYY 192
++ ++ ++I PF QRP+ ETR+++ + N+ F +D+ + V
Sbjct: 117 AHHNQSVAHKNISPFKQRPKNETRIETQLVSHNVALSKFNARDLRAEKSKGTIEMEVYIT 176
Query: 193 ARGTYKV 199
AR +YK
Sbjct: 177 ARVSYKT 183
>AT2G35460.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr2:14905788-14906504 FORWARD LENGTH=238
Length = 238
Score = 52.0 bits (123), Expect = 4e-07, Method: Compositional matrix adjust.
Identities = 35/146 (23%), Positives = 69/146 (47%), Gaps = 6/146 (4%)
Query: 60 VLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANFT 119
++C I VL+ G+ LI++ +P F + A L FD S+ N + NF+
Sbjct: 53 IICNILIGVLVCLGVVALILWFILRPNVVKFQVTEADLTRFEFDPRSH-NLHYNISLNFS 111
Query: 120 --NPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMSNLMFLPQDVG 177
NPN R+ + ++ L++ ++ D+ S+ ++ F Q + T + +L + L +G
Sbjct: 112 IRNPNQRLGIHYDQLEVRGYYGDQRFSAANMTSFYQGHKNTTVVGTELNGQKLVL---LG 168
Query: 178 VKLQGQVQSNRVNYYARGTYKVRFNM 203
+ + +R + R K+RF +
Sbjct: 169 AGGRRDFREDRRSGVYRIDVKLRFKL 194
>AT2G35980.1 | Symbols: YLS9, NHL10, ATNHL10 | Late embryogenesis
abundant (LEA) hydroxyproline-rich glycoprotein family |
chr2:15110635-15111318 FORWARD LENGTH=227
Length = 227
Score = 49.3 bits (116), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 45/189 (23%), Positives = 82/189 (43%), Gaps = 8/189 (4%)
Query: 23 SDEPRNQQYSHTNSTSKLPALWQPR-HQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYL 81
+++P N + + P + R H + + + +I SL++I G+A LI +L
Sbjct: 3 AEQPLNGAFYGPSVPPPAPKGYYRRGHGRGCGCCLLSLFVKVIISLIVIL-GVAALIFWL 61
Query: 82 SFKPRNPIFDIPNASLNVVYFDSTS---YLNGEFAFLANFTNPNSRIHVRFESLKIELFF 138
+PR F + +ASL FD TS L A NPN RI + ++ ++ ++
Sbjct: 62 IVRPRAIKFHVTDASLT--RFDHTSPDNILRYNLALTVPVRNPNKRIGLYYDRIEAHAYY 119
Query: 139 SDRLISSQSIKPFTQRPRETRLQSVKLMS-NLMFLPQDVGVKLQGQVQSNRVNYYARGTY 197
+ S+ ++ PF Q + T + + NL+ L + S N +
Sbjct: 120 EGKRFSTITLTPFYQGHKNTTVLTPTFQGQNLVIFNAGQSRTLNAERISGVYNIEIKFRL 179
Query: 198 KVRFNMGHI 206
+VRF +G +
Sbjct: 180 RVRFKLGDL 188