Miyakogusa Predicted Gene

Lj2g3v3319320.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj2g3v3319320.1 tr|B6UDA2|B6UDA2_MAIZE Harpin-induced protein
OS=Zea mays PE=2 SV=1,27.23,6e-19,LEA_2,Late embryogenesis abundant
protein, LEA-14; seg,NULL,NODE_45710_length_1044_cov_36.521072.path2.1
         (237 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT4G26490.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...   220   8e-58
AT5G56050.1 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   205   2e-53
AT3G26350.1 | Symbols:  | LOCATED IN: chloroplast; EXPRESSED IN:...   118   3e-27
AT1G13050.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...   114   6e-26
AT1G13050.2 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   109   2e-24
AT5G22870.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...    59   4e-09
AT3G11650.1 | Symbols: NHL2 | NDR1/HIN1-like 2 | chr3:3676264-36...    58   5e-09
AT5G06320.1 | Symbols: NHL3 | NDR1/HIN1-like 3 | chr5:1931016-19...    57   1e-08
AT4G05220.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...    56   2e-08
AT1G08160.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...    52   2e-07
AT2G35460.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...    52   4e-07
AT2G35980.1 | Symbols: YLS9, NHL10, ATNHL10 | Late embryogenesis...    49   2e-06

>AT4G26490.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr4:13380425-13381231 FORWARD LENGTH=268
          Length = 268

 Score =  220 bits (560), Expect = 8e-58,   Method: Compositional matrix adjust.
 Identities = 97/193 (50%), Positives = 144/193 (74%)

Query: 43  LWQPRHQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYF 102
           L QPR  +T   IW  A  C +FSL+LIFF IATLI++L+ +PR P+FDIPNA+L+ +YF
Sbjct: 74  LRQPRSSRTSLWIWCVAGFCFVFSLLLIFFAIATLIVFLAIRPRIPVFDIPNANLHTIYF 133

Query: 103 DSTSYLNGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQS 162
           D+  + NG+ + L NFTNPN +I V+FE L+IELFF +RLI++Q ++PF Q+  ETRL+ 
Sbjct: 134 DTPEFFNGDLSMLVNFTNPNKKIEVKFEKLRIELFFFNRLIAAQVVQPFLQKKHETRLEP 193

Query: 163 VKLMSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHIHLSHWLHSVCHIEMNG 222
           ++L+S+L+ LP +  V+L+ Q+++N++ Y  RGT+KV+ + G IH S+ LH  C ++M G
Sbjct: 194 IRLISSLVGLPVNHAVELRRQLENNKIEYEIRGTFKVKAHFGMIHYSYQLHGRCQLQMTG 253

Query: 223 PPNGVLLARECTT 235
           PP G+L++R CTT
Sbjct: 254 PPTGILISRNCTT 266


>AT5G56050.1 | Symbols:  | FUNCTIONS IN: molecular_function unknown;
           INVOLVED IN: biological_process unknown; LOCATED IN:
           chloroplast; BEST Arabidopsis thaliana protein match is:
           Late embryogenesis abundant (LEA) hydroxyproline-rich
           glycoprotein family (TAIR:AT4G26490.1); Has 1807 Blast
           hits to 1807 proteins in 277 species: Archae - 0;
           Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
           Viruses - 0; Other Eukaryotes - 339 (source: NCBI
           BLink). | chr5:22701167-22702018 REVERSE LENGTH=283
          Length = 283

 Score =  205 bits (522), Expect = 2e-53,   Method: Compositional matrix adjust.
 Identities = 96/196 (48%), Positives = 137/196 (69%)

Query: 42  ALWQPRHQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVY 101
            L Q R  +T P IW  A LC IFS++LI FGIATLI+YL+ KPR P+FDI NA LN + 
Sbjct: 87  VLLQLRTSRTNPWIWCGAALCFIFSILLIVFGIATLILYLAVKPRTPVFDISNAKLNTIL 146

Query: 102 FDSTSYLNGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQ 161
           F+S  Y NG+     NFTNPN +++VRFE+L +EL+F+D  I++Q + PF+QR  +TRL+
Sbjct: 147 FESPVYFNGDMLLQLNFTNPNKKLNVRFENLMVELWFADTKIATQGVLPFSQRNGKTRLE 206

Query: 162 SVKLMSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHIHLSHWLHSVCHIEMN 221
            ++L+SNL+FLP +  ++L+ QV SNR+ Y  R  ++V+   G IH S+ LH +C ++++
Sbjct: 207 PIRLISNLVFLPVNHILELRRQVTSNRIAYEIRSNFRVKAIFGMIHYSYMLHGICQLQLS 266

Query: 222 GPPNGVLLARECTTWR 237
            PP G L+ R CTT R
Sbjct: 267 SPPAGGLVYRNCTTKR 282


>AT3G26350.1 | Symbols:  | LOCATED IN: chloroplast; EXPRESSED IN:
           root, pedicel, carpel, stamen; EXPRESSED DURING: 4
           anthesis, petal differentiation and expansion stage;
           CONTAINS InterPro DOMAIN/s: Late embryogenesis abundant
           protein, group 2 (InterPro:IPR004864); BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT1G13050.1); Has 3534 Blast hits to 2704 proteins
           in 342 species: Archae - 6; Bacteria - 192; Metazoa -
           1076; Fungi - 505; Plants - 1162; Viruses - 224; Other
           Eukaryotes - 369 (source: NCBI BLink). |
           chr3:9653660-9654730 REVERSE LENGTH=356
          Length = 356

 Score =  118 bits (295), Expect = 3e-27,   Method: Compositional matrix adjust.
 Identities = 70/193 (36%), Positives = 110/193 (56%), Gaps = 1/193 (0%)

Query: 46  PRHQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDST 105
           P  ++T  + W AA  C IF ++LI  G+  LI+YL ++PR+P  DI  A+LN  Y D  
Sbjct: 164 PPSRETNAMTWSAAFCCAIFWVILILGGLIILIVYLVYRPRSPYVDISAANLNAAYLDMG 223

Query: 106 SYLNGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKL 165
             LNG+   LAN TNP+ +  V F  +  EL++ + LI++Q I+PF    + +   +V L
Sbjct: 224 FLLNGDLTILANVTNPSKKSSVEFSYVTFELYYYNTLIATQYIEPFKVPKKTSMFANVHL 283

Query: 166 MSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHI-HLSHWLHSVCHIEMNGPP 224
           +S+ + L      +LQ Q+++  V    RG +  R ++G +   S+ LH+ C + +NGPP
Sbjct: 284 VSSQVQLQATQSRELQRQIETGPVLLNLRGMFHARSHIGPLFRYSYKLHTHCSVSLNGPP 343

Query: 225 NGVLLARECTTWR 237
            G + AR C T R
Sbjct: 344 LGAMRARRCNTKR 356


>AT1G13050.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT3G26350.1); Has 538 Blast hits to 510 proteins
           in 88 species: Archae - 0; Bacteria - 23; Metazoa - 81;
           Fungi - 36; Plants - 361; Viruses - 8; Other Eukaryotes
           - 29 (source: NCBI BLink). | chr1:4450568-4451521
           FORWARD LENGTH=317
          Length = 317

 Score =  114 bits (285), Expect = 6e-26,   Method: Compositional matrix adjust.
 Identities = 70/190 (36%), Positives = 107/190 (56%), Gaps = 1/190 (0%)

Query: 49  QKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYL 108
           ++TKP+   A + C I  +VLI  G+  L++YL+ +PR+P FDI  A+LN    D    L
Sbjct: 128 KRTKPMTLPATICCAILLIVLILSGLILLLVYLANRPRSPYFDISAATLNTANLDMGYVL 187

Query: 109 NGEFAFLANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMSN 168
           NG+ A + NFTNP+ +  V F  +  EL+F + LI+++ I+PF      +   S  L+S+
Sbjct: 188 NGDLAVVVNFTNPSKKSSVDFSYVMFELYFYNTLIATEHIEPFIVPKGMSMFTSFHLVSS 247

Query: 169 LMFLPQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHI-HLSHWLHSVCHIEMNGPPNGV 227
            + +       LQ Q+ +  V    RGT+  R N+G +   S+WLH+ C I +N PP G 
Sbjct: 248 QVQIQMIQSQDLQLQLGTGPVLLNLRGTFHARSNLGSLMRYSYWLHTQCSISLNTPPAGT 307

Query: 228 LLARECTTWR 237
           + AR C T R
Sbjct: 308 MRARRCNTKR 317


>AT1G13050.2 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: endomembrane
           system; EXPRESSED IN: 14 plant structures; EXPRESSED
           DURING: 9 growth stages; BEST Arabidopsis thaliana
           protein match is: unknown protein (TAIR:AT3G26350.1);
           Has 260 Blast hits to 259 proteins in 20 species: Archae
           - 0; Bacteria - 0; Metazoa - 0; Fungi - 0; Plants - 260;
           Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink).
           | chr1:4450964-4451521 FORWARD LENGTH=185
          Length = 185

 Score =  109 bits (273), Expect = 2e-24,   Method: Compositional matrix adjust.
 Identities = 59/160 (36%), Positives = 89/160 (55%), Gaps = 1/160 (0%)

Query: 79  IYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANFTNPNSRIHVRFESLKIELFF 138
           +YL+ +PR+P FDI  A+LN    D    LNG+ A + NFTNP+ +  V F  +  EL+F
Sbjct: 26  VYLANRPRSPYFDISAATLNTANLDMGYVLNGDLAVVVNFTNPSKKSSVDFSYVMFELYF 85

Query: 139 SDRLISSQSIKPFTQRPRETRLQSVKLMSNLMFLPQDVGVKLQGQVQSNRVNYYARGTYK 198
            + LI+++ I+PF      +   S  L+S+ + +       LQ Q+ +  V    RGT+ 
Sbjct: 86  YNTLIATEHIEPFIVPKGMSMFTSFHLVSSQVQIQMIQSQDLQLQLGTGPVLLNLRGTFH 145

Query: 199 VRFNMGHI-HLSHWLHSVCHIEMNGPPNGVLLARECTTWR 237
            R N+G +   S+WLH+ C I +N PP G + AR C T R
Sbjct: 146 ARSNLGSLMRYSYWLHTQCSISLNTPPAGTMRARRCNTKR 185


>AT5G22870.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr5:7647056-7647679 REVERSE LENGTH=207
          Length = 207

 Score = 58.5 bits (140), Expect = 4e-09,   Method: Compositional matrix adjust.
 Identities = 38/155 (24%), Positives = 76/155 (49%), Gaps = 3/155 (1%)

Query: 59  AVLCLIFSLVLIFFGIAT---LIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFL 115
           +++C IF ++L    +A    LI +L  KP+   + + NAS+      + ++++  F F 
Sbjct: 24  SLICYIFLVILTLIFMAAVGFLITWLETKPKKLRYTVENASVQNFNLTNDNHMSATFQFT 83

Query: 116 ANFTNPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMSNLMFLPQD 175
               NPN RI V + S++I + F D+ ++  +++PF Q     +     L++  + + + 
Sbjct: 84  IQSHNPNHRISVYYSSVEIFVKFKDQTLAFDTVEPFHQPRMNVKQIDETLIAENVAVSKS 143

Query: 176 VGVKLQGQVQSNRVNYYARGTYKVRFNMGHIHLSH 210
            G  L+ Q    ++ +      +VRF +G    SH
Sbjct: 144 NGKDLRSQNSLGKIGFEVFVKARVRFKVGIWKSSH 178


>AT3G11650.1 | Symbols: NHL2 | NDR1/HIN1-like 2 |
           chr3:3676264-3676986 REVERSE LENGTH=240
          Length = 240

 Score = 58.2 bits (139), Expect = 5e-09,   Method: Compositional matrix adjust.
 Identities = 40/148 (27%), Positives = 73/148 (49%), Gaps = 7/148 (4%)

Query: 59  AVLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANF 118
           +++C I   V +  G+A LI++L F+P    F + +A+LN   FD  + L   ++   NF
Sbjct: 53  SLICNILIAVAVILGVAALILWLIFRPNAVKFYVADANLNRFSFDPNNNL--HYSLDLNF 110

Query: 119 T--NPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMS-NLMFLPQD 175
           T  NPN R+ V ++   +  ++ D+   S ++  F Q  + T +   K+   NL+ L   
Sbjct: 111 TIRNPNQRVGVYYDEFSVSGYYGDQRFGSANVSSFYQGHKNTTVILTKIEGQNLVVLGDG 170

Query: 176 VGVKLQGQVQSN--RVNYYARGTYKVRF 201
               L+   +S   R+N   R + + +F
Sbjct: 171 ARTDLKDDEKSGIYRINAKLRLSVRFKF 198


>AT5G06320.1 | Symbols: NHL3 | NDR1/HIN1-like 3 |
           chr5:1931016-1931711 REVERSE LENGTH=231
          Length = 231

 Score = 57.0 bits (136), Expect = 1e-08,   Method: Compositional matrix adjust.
 Identities = 47/154 (30%), Positives = 74/154 (48%), Gaps = 9/154 (5%)

Query: 60  VLCLIFSLVL---IFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLA 116
           +L +IF++++   +  GIA LII+L F+P    F + +A L     D T+ L   +    
Sbjct: 44  ILSVIFNILITIAVLLGIAALIIWLIFRPNAIKFHVTDAKLTEFTLDPTNNL--RYNLDL 101

Query: 117 NFT--NPNSRIHVRFESLKIELFFSD-RLISSQSIKPFTQRPRETRLQSVKLMS-NLMFL 172
           NFT  NPN RI V ++ +++  ++ D R   S +I  F Q  + T +   KL+   L+ L
Sbjct: 102 NFTIRNPNRRIGVYYDEIEVRGYYGDQRFGMSNNISKFYQGHKNTTVVGTKLVGQQLVLL 161

Query: 173 PQDVGVKLQGQVQSNRVNYYARGTYKVRFNMGHI 206
                  L   V S      A+   K+RF  G I
Sbjct: 162 DGGERKDLNEDVNSQIYRIDAKLRLKIRFKFGLI 195


>AT4G05220.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr4:2685104-2685784 REVERSE LENGTH=226
          Length = 226

 Score = 55.8 bits (133), Expect = 2e-08,   Method: Compositional matrix adjust.
 Identities = 41/173 (23%), Positives = 78/173 (45%), Gaps = 5/173 (2%)

Query: 61  LCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANFTN 120
           +C +F LVL F G+   I++LS +P  P F I +  +  +    T   N   AF     N
Sbjct: 46  ICAMFLLVLFFVGVIAFILWLSLRPHRPRFHIQDFVVQGLD-QPTGVENARIAFNVTILN 104

Query: 121 PNSRIHVRFESLKIELFFSDRLISSQS-IKPFTQRPRETRLQSVKLMSNLMFLPQDVGVK 179
           PN  + V F+S++  +++ D+ +     + PF Q+P  T + +  L    + +  +   +
Sbjct: 105 PNQHMGVYFDSMEGSIYYKDQRVGLIPLLNPFFQQPTNTTIVTGTLTGASLTVNSNRWTE 164

Query: 180 LQGQVQSNRVNYYARGTYKVRFNMGH-IHLSHWLHSVCHIEMNGPPNGVLLAR 231
                    V +       +RF +   I   H +H+ C+I +    +G++L +
Sbjct: 165 FSNDRAQGTVGFRLDIVSTIRFKLHRWISKHHRMHANCNIVVGR--DGLILPK 215


>AT1G08160.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr1:2559672-2560337 REVERSE LENGTH=221
          Length = 221

 Score = 52.4 bits (124), Expect = 2e-07,   Method: Compositional matrix adjust.
 Identities = 32/127 (25%), Positives = 66/127 (51%), Gaps = 5/127 (3%)

Query: 78  IIYLSFKPRNPIFDIPNASLNVVYF-DSTSYLNGEFAFLANFTNPNSRIHVRFESLKIEL 136
           I YL+ +P+  I+ +  AS+      ++  ++N +F+++    NP   + VR+ S++I  
Sbjct: 57  ITYLTLRPKRLIYTVEAASVQEFAIGNNDDHINAKFSYVIKSYNPEKHVSVRYHSMRIST 116

Query: 137 FFSDRLISSQSIKPFTQRPR-ETRLQSVKLMSNLM---FLPQDVGVKLQGQVQSNRVNYY 192
              ++ ++ ++I PF QRP+ ETR+++  +  N+    F  +D+  +         V   
Sbjct: 117 AHHNQSVAHKNISPFKQRPKNETRIETQLVSHNVALSKFNARDLRAEKSKGTIEMEVYIT 176

Query: 193 ARGTYKV 199
           AR +YK 
Sbjct: 177 ARVSYKT 183


>AT2G35460.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr2:14905788-14906504 FORWARD LENGTH=238
          Length = 238

 Score = 52.0 bits (123), Expect = 4e-07,   Method: Compositional matrix adjust.
 Identities = 35/146 (23%), Positives = 69/146 (47%), Gaps = 6/146 (4%)

Query: 60  VLCLIFSLVLIFFGIATLIIYLSFKPRNPIFDIPNASLNVVYFDSTSYLNGEFAFLANFT 119
           ++C I   VL+  G+  LI++   +P    F +  A L    FD  S+ N  +    NF+
Sbjct: 53  IICNILIGVLVCLGVVALILWFILRPNVVKFQVTEADLTRFEFDPRSH-NLHYNISLNFS 111

Query: 120 --NPNSRIHVRFESLKIELFFSDRLISSQSIKPFTQRPRETRLQSVKLMSNLMFLPQDVG 177
             NPN R+ + ++ L++  ++ D+  S+ ++  F Q  + T +   +L    + L   +G
Sbjct: 112 IRNPNQRLGIHYDQLEVRGYYGDQRFSAANMTSFYQGHKNTTVVGTELNGQKLVL---LG 168

Query: 178 VKLQGQVQSNRVNYYARGTYKVRFNM 203
              +   + +R +   R   K+RF +
Sbjct: 169 AGGRRDFREDRRSGVYRIDVKLRFKL 194


>AT2G35980.1 | Symbols: YLS9, NHL10, ATNHL10 | Late embryogenesis
           abundant (LEA) hydroxyproline-rich glycoprotein family |
           chr2:15110635-15111318 FORWARD LENGTH=227
          Length = 227

 Score = 49.3 bits (116), Expect = 2e-06,   Method: Compositional matrix adjust.
 Identities = 45/189 (23%), Positives = 82/189 (43%), Gaps = 8/189 (4%)

Query: 23  SDEPRNQQYSHTNSTSKLPALWQPR-HQKTKPVIWFAAVLCLIFSLVLIFFGIATLIIYL 81
           +++P N  +   +     P  +  R H +       +  + +I SL++I  G+A LI +L
Sbjct: 3   AEQPLNGAFYGPSVPPPAPKGYYRRGHGRGCGCCLLSLFVKVIISLIVIL-GVAALIFWL 61

Query: 82  SFKPRNPIFDIPNASLNVVYFDSTS---YLNGEFAFLANFTNPNSRIHVRFESLKIELFF 138
             +PR   F + +ASL    FD TS    L    A      NPN RI + ++ ++   ++
Sbjct: 62  IVRPRAIKFHVTDASLT--RFDHTSPDNILRYNLALTVPVRNPNKRIGLYYDRIEAHAYY 119

Query: 139 SDRLISSQSIKPFTQRPRETRLQSVKLMS-NLMFLPQDVGVKLQGQVQSNRVNYYARGTY 197
             +  S+ ++ PF Q  + T + +      NL+         L  +  S   N   +   
Sbjct: 120 EGKRFSTITLTPFYQGHKNTTVLTPTFQGQNLVIFNAGQSRTLNAERISGVYNIEIKFRL 179

Query: 198 KVRFNMGHI 206
           +VRF +G +
Sbjct: 180 RVRFKLGDL 188