Miyakogusa Predicted Gene

Lj3g3v0235130.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj3g3v0235130.1 Non Characterized Hit- tr|G7II20|G7II20_MEDTR
Putative uncharacterized protein OS=Medicago
truncatul,85.16,0,LEA_2,Late embryogenesis abundant protein, LEA-14;
seg,NULL,CUFF.40396.1
         (311 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   180   1e-45
AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein, g...   170   9e-43
AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   167   7e-42
AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   131   6e-31
AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...   119   3e-27
AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late embry...    96   4e-20
AT3G08490.1 | Symbols:  | BEST Arabidopsis thaliana protein matc...    65   4e-11
AT2G35460.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...    48   7e-06
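Note: the one-line summaries above follow a fixed layout (subject ID, a "|"-separated description, bit score, E value). A minimal Python sketch for reading such lines is given here. The names Hit, HIT_LINE and parse_hit_line are hypothetical helpers and not part of BLAST. The sketch assumes the two right-most whitespace-separated fields are always the score and the E value, and that an E value written without a leading digit (such as "e-100") means 1e-100.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One row of the 'Sequences producing significant alignments' table."""
    subject_id: str
    bits: float
    evalue: float


# Assumed layout: "<id> | <description>   <bits>   <E value>".
# The greedy ".*" leaves the last two fields for the score and E value.
HIT_LINE = re.compile(r"^(\S+)\s+\|.*\s(\d+(?:\.\d+)?)\s+(\S+)\s*$")


def parse_hit_line(line: str) -> Optional[Hit]:
    """Return a Hit for a summary line, or None if the line does not match."""
    match = HIT_LINE.match(line)
    if match is None:
        return None
    subject_id, bits, evalue = match.groups()
    # Assumption: a bare exponent such as "e-100" stands for "1e-100".
    if evalue.startswith("e"):
        evalue = "1" + evalue
    try:
        return Hit(subject_id, float(bits), float(evalue))
    except ValueError:
        return None


if __name__ == "__main__":
    sample = ("AT3G24600.1 | Symbols:  | Late embryogenesis abundant "
              "protein, g...   170   9e-43")
    print(parse_hit_line(sample))
```

Header lines and blank lines return None, so the function can be mapped over every line of the report and the None results dropped.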

>AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 258 Blast
           hits to 242 proteins in 39 species: Archae - 0; Bacteria
           - 11; Metazoa - 10; Fungi - 14; Plants - 198; Viruses -
           17; Other Eukaryotes - 8 (source: NCBI BLink). |
           chr1:17191502-17192870 FORWARD LENGTH=342
          Length = 342

 Score =  180 bits (456), Expect = 1e-45,   Method: Compositional matrix adjust.
 Identities = 125/321 (38%), Positives = 165/321 (51%), Gaps = 33/321 (10%)

Query: 3   LSAKSDSDVTSLAXXXXXXXXXXXVYYVQSPSRDSHDGDKSS-SMQATPISNSPMESP-- 59
           + AK+DS+VTSLA           VYYVQSPSRDSHDG+K++ S  +TP+  SPM SP  
Sbjct: 1   MHAKTDSEVTSLAASSPARSPRRPVYYVQSPSRDSHDGEKTATSFHSTPVL-SPMGSPPH 59

Query: 60  SHPSFGRHSRNXXXXXXXXXXXXXXXXXXXXXXXNDKG------WPECDVILEEGSYHEF 113
           SH S GRHSR                          KG      W EC VI EEG   + 
Sbjct: 60  SHSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWKECAVIEEEGLLDDG 119

Query: 114 E-DRAFMRRCQGXXXXXXXXXXXXXXXXXXWGASRPFKAEVAVKSLTVHNLYIGEGSDFT 172
           + D    RRC                    +GA++P K ++ VKS+T   L I  G D  
Sbjct: 120 DRDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQDAG 179

Query: 173 GVPTKILTVNSTLRMSIYNPATFFGIHVHSTPINLVFSDISVATGELKKYYQPRKSHRMV 232
           GV T ++T+N+TLRM   N  TFFG+HV STPI+L FS I + +G +KK+YQ RKS R V
Sbjct: 180 GVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSVKKFYQGRKSERTV 239

Query: 233 SVNLVGKKVPLYGAGST----------------------ITESQTGVVVVSLKLKFEIVS 270
            V+++G+K+PLYG+GST                      I +       V + L F + S
Sbjct: 240 LVHVIGEKIPLYGSGSTLLPPAPPAPLPKPKKKKGAPVPIPDPPAPPAPVPMTLSFVVRS 299

Query: 271 RGNVVGKLVRTKHHKEITCPL 291
           R  V+GKLV+ K +K+I C +
Sbjct: 300 RAYVLGKLVQPKFYKKIECDI 320


>AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein,
           group 2 | chr3:8972195-8974867 REVERSE LENGTH=506
          Length = 506

 Score =  170 bits (431), Expect = 9e-43,   Method: Compositional matrix adjust.
 Identities = 93/213 (43%), Positives = 129/213 (60%), Gaps = 4/213 (1%)

Query: 98  WPECDVILEEGSYHEFEDRAFMRRCQGXXXXXXXXXXXXXXXXXXWGASRPFKAEVAVKS 157
           WPE    + E   ++      + +C+                   WGAS PF   V+VKS
Sbjct: 291 WPEKPYTINEDEVYDDNRGLSVGQCRAVLVILGTVVVFSVFCSVLWGASHPFSPIVSVKS 350

Query: 158 LTVHNLYIGEGSDFTGVPTKILTVNSTLRMSIYNPATFFGIHVHSTPINLVFSDISVATG 217
           + +H+ Y GEG D TGV TKIL+ NS+++++I +PA +FGIHV S+   L FS +++ATG
Sbjct: 351 VDIHSFYYGEGIDRTGVATKILSFNSSVKVTIDSPAPYFGIHVSSSTFKLTFSALTLATG 410

Query: 218 ELKKYYQPRKSHRMVSVNLVGKKVPLYGAGSTITES-QTGVVVVSLKLKFEIVSRGNVVG 276
           +LK YYQPRKS  +  V L G +VPLYGAG  +  S + G V V  KL+FEI SRGN++G
Sbjct: 411 QLKSYYQPRKSKHISIVKLTGAEVPLYGAGPHLAASDKKGKVPV--KLEFEIRSRGNLLG 468

Query: 277 KLVRTKHHKEITCPLVIDSSG-SKPIKFKKNSC 308
           KLV++KH   ++C   I SS  SKPI+F   +C
Sbjct: 469 KLVKSKHENHVSCSFFISSSKTSKPIEFTHKTC 501



 Score =  135 bits (340), Expect = 4e-32,   Method: Compositional matrix adjust.
 Identities = 95/260 (36%), Positives = 126/260 (48%), Gaps = 25/260 (9%)

Query: 1   MMLSAKSDSDVTSLAXXXXXXXXXXXVYYVQSPSRDSHDGDKSSSMQATPISNSPMESPS 60
           M +  KSDSDVTSL             YYVQSPSRDS   DKSSS+  T    +P ESPS
Sbjct: 1   MKMYPKSDSDVTSL----DLSSPKRPTYYVQSPSRDS---DKSSSVALTTHQTTPTESPS 53

Query: 61  HPSFGRHSRNXXXXXXXXXXXXXXXXXXXXXXXNDKGWPECDVILEEGSYHEFEDRAFMR 120
           HPS      N                           WP      EEG    +ED     
Sbjct: 54  HPSIASRVSNGGGGGFRWKGRRKYHGGIW--------WPADK---EEGGDGRYEDLYEDN 102

Query: 121 R------CQGXXXXXXXXXXXXXXXXXXWGASRPFKAEVAVKSLTVHNLYIGEGSDFTGV 174
           R      C+                   +GAS+     V +K + V + Y GEGSD TGV
Sbjct: 103 RGVSIVTCRLILGVVATLSIFFLLCSVLFGASQSSPPIVYIKGVNVRSFYYGEGSDNTGV 162

Query: 175 PTKILTVNSTLRMSIYNPATFFGIHVHSTPINLVFS-DISVATGELKKYYQPRKSHRMVS 233
           PTKI+ V  ++ ++ +NP+T FGIHV ST ++L++S   ++A   LK Y+QP++S+    
Sbjct: 163 PTKIMNVKCSVVITTHNPSTLFGIHVSSTAVSLIYSRQFTLANARLKSYHQPKQSNHTSR 222

Query: 234 VNLVGKKVPLYGAGSTITES 253
           +NL+G KVPLYGAG+ +  S
Sbjct: 223 INLIGSKVPLYGAGAELVAS 242


>AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 19 plant structures; EXPRESSED DURING: 11
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT1G45688.1); Has 1807 Blast
           hits to 1807 proteins in 277 species: Archae - 0;
           Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
           Viruses - 0; Other Eukaryotes - 339 (source: NCBI
           BLink). | chr5:17183339-17184857 REVERSE LENGTH=320
          Length = 320

 Score =  167 bits (424), Expect = 7e-42,   Method: Compositional matrix adjust.
 Identities = 111/313 (35%), Positives = 156/313 (49%), Gaps = 43/313 (13%)

Query: 3   LSAKSDSDVTSLAXXXXXXXXXXXVYYVQSPSRDSHDGDKSS-SMQATPISNSPM--ESP 59
           + AK+DS+VTSL+            Y+VQSPSRDSHDG+K++ S  +TP+  SPM     
Sbjct: 1   MHAKTDSEVTSLSASSPTRSPRRPAYFVQSPSRDSHDGEKTATSFHSTPVLTSPMGSPPH 60

Query: 60  SHPSFGRHSRNXXXXXXXXXXXXXXXXXXXXXXXNDKGWPECDVILEEGSYHE--FEDRA 117
           SH S  R S+                           G  +  +I EEG   +   E  A
Sbjct: 61  SHSSSSRFSK-----------------INGSKRKGHAGEKQFAMIEEEGLLDDGDREQEA 103

Query: 118 FMRRCQGXXXXXXXXXXXXXXXXXXWGASRPFKAEVAVKSLTVHNLYIGEGSDFTGVPTK 177
             RRC                    + A++P K +++VKS+T   L +  G D  G+ T 
Sbjct: 104 LPRRCYVLAFIVGFSLLFAFFSLILYAAAKPQKPKISVKSITFEQLKVQAGQDAGGIGTD 163

Query: 178 ILTVNSTLRMSIYNPATFFGIHVHSTPINLVFSDISVATGELKKYYQPRKSHRMVSVNLV 237
           ++T+N+TLRM   N  TFFG+HV S+PI+L FS I++ +G +KK+YQ RKS R V VN++
Sbjct: 164 MITMNATLRMLYRNTGTFFGVHVTSSPIDLSFSQITIGSGSIKKFYQSRKSQRTVVVNVL 223

Query: 238 GKKVPLYGAGST---------------------ITESQTGVVVVSLKLKFEIVSRGNVVG 276
           G K+PLYG+GST                     I E       V ++L F + SR  V+G
Sbjct: 224 GDKIPLYGSGSTLVPPPPPAPIPKPKKKKGPIVIVEPPAPPAPVPMRLNFTVRSRAYVLG 283

Query: 277 KLVRTKHHKEITC 289
           KLV+ K +K I C
Sbjct: 284 KLVQPKFYKRIVC 296


>AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 35333 Blast
           hits to 34131 proteins in 2444 species: Archae - 798;
           Bacteria - 22429; Metazoa - 974; Fungi - 991; Plants -
           531; Viruses - 0; Other Eukaryotes - 9610 (source: NCBI
           BLink). | chr1:17191502-17192464 FORWARD LENGTH=248
          Length = 248

 Score =  131 bits (330), Expect = 6e-31,   Method: Compositional matrix adjust.
 Identities = 94/239 (39%), Positives = 122/239 (51%), Gaps = 15/239 (6%)

Query: 3   LSAKSDSDVTSLAXXXXXXXXXXXVYYVQSPSRDSHDGDKSS-SMQATPISNSPMESP-- 59
           + AK+DS+VTSLA           VYYVQSPSRDSHDG+K++ S  +TP+  SPM SP  
Sbjct: 1   MHAKTDSEVTSLAASSPARSPRRPVYYVQSPSRDSHDGEKTATSFHSTPVL-SPMGSPPH 59

Query: 60  SHPSFGRHSRNXXXXXXXXXXXXXXXXXXXXXXXNDKG------WPECDVILEEGSYHEF 113
           SH S GRHSR                          KG      W EC VI EEG   + 
Sbjct: 60  SHSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWKECAVIEEEGLLDDG 119

Query: 114 E-DRAFMRRCQGXXXXXXXXXXXXXXXXXXWGASRPFKAEVAVKSLTVHNLYIGEGSDFT 172
           + D    RRC                    +GA++P K ++ VKS+T   L I  G D  
Sbjct: 120 DRDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQDAG 179

Query: 173 GVPTKILTVNSTLRMSIYNPATFFGIHVHSTPINLVFSDISVATGE----LKKYYQPRK 227
           GV T ++T+N+TLRM   N  TFFG+HV STPI+L FS I + +G     ++K Y+ R+
Sbjct: 180 GVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSVSLPIQKLYRMRE 238


>AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr4:16736839-16738186 FORWARD LENGTH=299
          Length = 299

 Score =  119 bits (297), Expect = 3e-27,   Method: Compositional matrix adjust.
 Identities = 56/153 (36%), Positives = 92/153 (60%), Gaps = 2/153 (1%)

Query: 143 WGASRPFKAEVAVKSLTVHNLYIGEGSDFTGVPTKILTVNSTLRMSIYNPATFFGIHVHS 202
           WG S+ F     +K + + NL +  G+D +GV T +LT+NST+R+   NPATFF +HV S
Sbjct: 129 WGVSKSFAPIATLKEMVLENLNVQSGNDQSGVLTDMLTLNSTVRILYRNPATFFTVHVTS 188

Query: 203 TPINLVFSDISVATGELKKYYQPRKSHRMVSVNLVGKKVPLYGAGSTI--TESQTGVVVV 260
            P+ L +S + +A+G++ ++ Q RKS R++   + G ++PLYG    +    ++   VV+
Sbjct: 189 APLQLSYSQLILASGQMGEFSQRRKSERIIETKVFGDQIPLYGGVPALFGQRAEPDQVVL 248

Query: 261 SLKLKFEIVSRGNVVGKLVRTKHHKEITCPLVI 293
            L L F + +R  V+G+LV+T  H  I C +  
Sbjct: 249 PLNLTFTLRARAYVLGRLVKTTFHSNIKCSITF 281


>AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late
           embryogenesis abundant protein, group 2
           (InterPro:IPR004864); BEST Arabidopsis thaliana protein
           match is: Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family
           (TAIR:AT4G35170.1); Has 172 Blast hits to 168 proteins
           in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 172; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr2:17527396-17528527 FORWARD
           LENGTH=297
          Length = 297

 Score = 95.5 bits (236), Expect = 4e-20,   Method: Compositional matrix adjust.
 Identities = 54/156 (34%), Positives = 95/156 (60%), Gaps = 8/156 (5%)

Query: 143 WGASRPFKAEVAVKSLTVHNLYIGEGSDFTGVPTKILTVNSTLRMSIYNPATFFGIHVHS 202
           WGAS+ +  +V VK + V +L +  G+D +GVPT +L++NST+R+   NP+TFF +HV +
Sbjct: 134 WGASKSYPPKVTVKGMLVRDLNLQAGNDLSGVPTDMLSLNSTVRIYYRNPSTFFAVHVTA 193

Query: 203 TPINLVFSDISVATGELKKYYQPRKSHRMVSVNLVGKKVPLYGAGSTITESQTGVVVVSL 262
           +P+ L +S++ +++GE+ K+   R     V   + G ++PLYG  S   ++      +SL
Sbjct: 194 SPLLLHYSNLLLSSGEMNKFTVGRNGETNVVTVVQGHQIPLYGGVSFHLDT------LSL 247

Query: 263 KLKFEIV--SRGNVVGKLVRTKHHKEITCPLVIDSS 296
            L   IV  S+  ++G+LV +K +  I C   +D++
Sbjct: 248 PLNLTIVLHSKAYILGRLVTSKFYTRIICSFTLDAN 283


>AT3G08490.1 | Symbols:  | BEST Arabidopsis thaliana protein match
           is: Late embryogenesis abundant protein, group 2
           (TAIR:AT3G24600.1); Has 161 Blast hits to 158 proteins
           in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 161; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr3:2574105-2575125 REVERSE
           LENGTH=271
          Length = 271

 Score = 65.5 bits (158), Expect = 4e-11,   Method: Compositional matrix adjust.
 Identities = 37/149 (24%), Positives = 69/149 (46%)

Query: 145 ASRPFKAEVAVKSLTVHNLYIGEGSDFTGVPTKILTVNSTLRMSIYNPATFFGIHVHSTP 204
           A++P    ++ +    +   + EG D  GV TK LT N + ++ I N +  FG+H+H   
Sbjct: 103 ATQPPHPNISFRIGRFNQFMLEEGVDSHGVSTKFLTFNCSTKLIIDNKSNVFGLHIHPPS 162

Query: 205 INLVFSDISVATGELKKYYQPRKSHRMVSVNLVGKKVPLYGAGSTITESQTGVVVVSLKL 264
           I   F  ++ A  +  K Y          + +      +YGAG+ + +       + L L
Sbjct: 163 IKFFFGPLNFAKAQGPKLYGLSHESTTFQLYIATTNRAMYGAGTEMNDMLLSRAGLPLIL 222

Query: 265 KFEIVSRGNVVGKLVRTKHHKEITCPLVI 293
           +  I+S   VV  ++  K+H ++ C L++
Sbjct: 223 RTSIISDYRVVWNIINPKYHHKVECLLLL 251


>AT2G35460.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr2:14905788-14906504 FORWARD LENGTH=238
          Length = 238

 Score = 48.1 bits (113), Expect = 7e-06,   Method: Compositional matrix adjust.
 Identities = 38/137 (27%), Positives = 61/137 (44%), Gaps = 5/137 (3%)

Query: 179 LTVNSTLRMSIYNPATFFGIHVHSTPINLVFSDISVATGELKKYYQPRKSHRMVSVNLVG 238
           L  N +L  SI NP    GIH     +   + D   +   +  +YQ  K+  +V   L G
Sbjct: 102 LHYNISLNFSIRNPNQRLGIHYDQLEVRGYYGDQRFSAANMTSFYQGHKNTTVVGTELNG 161

Query: 239 KKVPLYGAGST---ITESQTGVVVVSLKLKFEIVSR-GNVVGKLVRTKHHKEITCPLVID 294
           +K+ L GAG       + ++GV  + +KL+F++  + G +    VR K    +  PL   
Sbjct: 162 QKLVLLGAGGRRDFREDRRSGVYRIDVKLRFKLRFKFGFLNSWAVRPKIKCHLKVPLSTS 221

Query: 295 SSGSKPIKFKKNSCTYD 311
           SS  +  +F    C  D
Sbjct: 222 SSDER-FQFHPTKCHVD 237