Miyakogusa Predicted Gene

Lj4g3v3114020.2

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj4g3v3114020.2 Non Characterized Hit- tr|G7LCF7|G7LCF7_MEDTR
Putative uncharacterized protein OS=Medicago
truncatul,85.61,0,LEA_2,Late embryogenesis abundant protein, LEA-14;
seg,NULL,CUFF.52372.2
         (279 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   246   1e-65
AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   238   3e-63
AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   197   5e-51
AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...   137   9e-33
AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late embry...   114   8e-26
AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein, g...   104   6e-23

>AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 258 Blast
           hits to 242 proteins in 39 species: Archae - 0; Bacteria
           - 11; Metazoa - 10; Fungi - 14; Plants - 198; Viruses -
           17; Other Eukaryotes - 8 (source: NCBI BLink). |
           chr1:17191502-17192870 FORWARD LENGTH=342
          Length = 342

 Score =  246 bits (628), Expect = 1e-65,   Method: Compositional matrix adjust.
 Identities = 149/320 (46%), Positives = 175/320 (54%), Gaps = 42/320 (13%)

Query: 1   MHAKTDSEVTSLDASSTTRSPRRPAYYVQSPS---HDGEKTTTSFHSTPVIX-------- 49
           MHAKTDSEVTSL ASS  RSPRRP YYVQSPS   HDGEKT TSFHSTPV+         
Sbjct: 1   MHAKTDSEVTSLAASSPARSPRRPVYYVQSPSRDSHDGEKTATSFHSTPVLSPMGSPPHS 60

Query: 50  ------XXXXXXXXXXXXXXXXXXXKMNHHNHR----NNSTKPWKDIDVIEEEGLLQSQD 99
                                    K+N ++      +   K WK+  VIEEEGLL   D
Sbjct: 61  HSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWKECAVIEEEGLLDDGD 120

Query: 100 HDYTRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIFVKSIKFEHLRVQAGSDATG 159
            D    RR                     +GA++PMKPKI VKSI FE L++QAG DA G
Sbjct: 121 RDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQDAGG 180

Query: 160 VATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVIATGNVKQFYXXXXXXXXXX 219
           V TDMITMN+T+R  YRNTGTFFGVHV+STP+DLS+S+I I +G+VK+FY          
Sbjct: 181 VGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSVKKFYQGRKSERTVL 240

Query: 220 XXXMGSKIPLYXXXXX---------------------XXXXXXMPTVPVPLKLSFVIRSR 258
              +G KIPLY                                 P  PVP+ LSFV+RSR
Sbjct: 241 VHVIGEKIPLYGSGSTLLPPAPPAPLPKPKKKKGAPVPIPDPPAPPAPVPMTLSFVVRSR 300

Query: 259 AYVLGKLVKPKYYKRIECSI 278
           AYVLGKLV+PK+YK+IEC I
Sbjct: 301 AYVLGKLVQPKFYKKIECDI 320


>AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 19 plant structures; EXPRESSED DURING: 11
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT1G45688.1); Has 1807 Blast
           hits to 1807 proteins in 277 species: Archae - 0;
           Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
           Viruses - 0; Other Eukaryotes - 339 (source: NCBI
           BLink). | chr5:17183339-17184857 REVERSE LENGTH=320
          Length = 320

 Score =  238 bits (608), Expect = 3e-63,   Method: Compositional matrix adjust.
 Identities = 140/302 (46%), Positives = 172/302 (56%), Gaps = 28/302 (9%)

Query: 1   MHAKTDSEVTSLDASSTTRSPRRPAYYVQSPS---HDGEKTTTSFHSTPVIXXXXXXXXX 57
           MHAKTDSEVTSL ASS TRSPRRPAY+VQSPS   HDGEKT TSFHSTPV+         
Sbjct: 1   MHAKTDSEVTSLSASSPTRSPRRPAYFVQSPSRDSHDGEKTATSFHSTPVLTSPMGSPPH 60

Query: 58  XXXXXXXXXXXKMNHHNHRNNSTKPWKDIDVIEEEGLLQSQDHDY-TRSRRXXXXXXXXX 116
                      K+N    + ++ +  K   +IEEEGLL   D +     RR         
Sbjct: 61  SHSSSSRFS--KINGSKRKGHAGE--KQFAMIEEEGLLDDGDREQEALPRRCYVLAFIVG 116

Query: 117 XXXXXXXXXXXXWGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYR 176
                       + A++P KPKI VKSI FE L+VQAG DA G+ TDMITMN+T+R  YR
Sbjct: 117 FSLLFAFFSLILYAAAKPQKPKISVKSITFEQLKVQAGQDAGGIGTDMITMNATLRMLYR 176

Query: 177 NTGTFFGVHVSSTPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXX 236
           NTGTFFGVHV+S+P+DLS+S+I I +G++K+FY             +G KIPLY      
Sbjct: 177 NTGTFFGVHVTSSPIDLSFSQITIGSGSIKKFYQSRKSQRTVVVNVLGDKIPLYGSGSTL 236

Query: 237 X--------------------XXXXMPTVPVPLKLSFVIRSRAYVLGKLVKPKYYKRIEC 276
                                     P  PVP++L+F +RSRAYVLGKLV+PK+YKRI C
Sbjct: 237 VPPPPPAPIPKPKKKKGPIVIVEPPAPPAPVPMRLNFTVRSRAYVLGKLVQPKFYKRIVC 296

Query: 277 SI 278
            I
Sbjct: 297 LI 298


>AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 35333 Blast
           hits to 34131 proteins in 2444 species: Archae - 798;
           Bacteria - 22429; Metazoa - 974; Fungi - 991; Plants -
           531; Viruses - 0; Other Eukaryotes - 9610 (source: NCBI
           BLink). | chr1:17191502-17192464 FORWARD LENGTH=248
          Length = 248

 Score =  197 bits (502), Expect = 5e-51,   Method: Compositional matrix adjust.
 Identities = 113/226 (50%), Positives = 132/226 (58%), Gaps = 21/226 (9%)

Query: 1   MHAKTDSEVTSLDASSTTRSPRRPAYYVQSPS---HDGEKTTTSFHSTPVIX-------- 49
           MHAKTDSEVTSL ASS  RSPRRP YYVQSPS   HDGEKT TSFHSTPV+         
Sbjct: 1   MHAKTDSEVTSLAASSPARSPRRPVYYVQSPSRDSHDGEKTATSFHSTPVLSPMGSPPHS 60

Query: 50  ------XXXXXXXXXXXXXXXXXXXKMNHHNHR----NNSTKPWKDIDVIEEEGLLQSQD 99
                                    K+N ++      +   K WK+  VIEEEGLL   D
Sbjct: 61  HSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWKECAVIEEEGLLDDGD 120

Query: 100 HDYTRSRRXXXXXXXXXXXXXXXXXXXXXWGASRPMKPKIFVKSIKFEHLRVQAGSDATG 159
            D    RR                     +GA++PMKPKI VKSI FE L++QAG DA G
Sbjct: 121 RDGGVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQDAGG 180

Query: 160 VATDMITMNSTVRFTYRNTGTFFGVHVSSTPLDLSYSEIVIATGNV 205
           V TDMITMN+T+R  YRNTGTFFGVHV+STP+DLS+S+I I +G+V
Sbjct: 181 VGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSV 226


>AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr4:16736839-16738186 FORWARD LENGTH=299
          Length = 299

 Score =  137 bits (345), Expect = 9e-33,   Method: Compositional matrix adjust.
 Identities = 67/152 (44%), Positives = 93/152 (61%), Gaps = 1/152 (0%)

Query: 129 WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 188
           WG S+   P   +K +  E+L VQ+G+D +GV TDM+T+NSTVR  YRN  TFF VHV+S
Sbjct: 129 WGVSKSFAPIATLKEMVLENLNVQSGNDQSGVLTDMLTLNSTVRILYRNPATFFTVHVTS 188

Query: 189 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMP-TVPV 247
            PL LSYS++++A+G + +F               G +IPLY            P  V +
Sbjct: 189 APLQLSYSQLILASGQMGEFSQRRKSERIIETKVFGDQIPLYGGVPALFGQRAEPDQVVL 248

Query: 248 PLKLSFVIRSRAYVLGKLVKPKYYKRIECSIT 279
           PL L+F +R+RAYVLG+LVK  ++  I+CSIT
Sbjct: 249 PLNLTFTLRARAYVLGRLVKTTFHSNIKCSIT 280


>AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late
           embryogenesis abundant protein, group 2
           (InterPro:IPR004864); BEST Arabidopsis thaliana protein
           match is: Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family
           (TAIR:AT4G35170.1); Has 172 Blast hits to 168 proteins
           in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 172; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr2:17527396-17528527 FORWARD
           LENGTH=297
          Length = 297

 Score =  114 bits (284), Expect = 8e-26,   Method: Compositional matrix adjust.
 Identities = 63/151 (41%), Positives = 92/151 (60%), Gaps = 5/151 (3%)

Query: 129 WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 188
           WGAS+   PK+ VK +    L +QAG+D +GV TDM+++NSTVR  YRN  TFF VHV++
Sbjct: 134 WGASKSYPPKVTVKGMLVRDLNLQAGNDLSGVPTDMLSLNSTVRIYYRNPSTFFAVHVTA 193

Query: 189 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPVP 248
           +PL L YS +++++G + +F               G +IPLY           + T+ +P
Sbjct: 194 SPLLLHYSNLLLSSGEMNKFTVGRNGETNVVTVVQGHQIPLY-----GGVSFHLDTLSLP 248

Query: 249 LKLSFVIRSRAYVLGKLVKPKYYKRIECSIT 279
           L L+ V+ S+AY+LG+LV  K+Y RI CS T
Sbjct: 249 LNLTIVLHSKAYILGRLVTSKFYTRIICSFT 279



 Score = 49.3 bits (116), Expect = 3e-06,   Method: Compositional matrix adjust.
 Identities = 27/42 (64%), Positives = 32/42 (76%), Gaps = 3/42 (7%)

Query: 1  MHAKTDSEVTSLDASSTT--RSPRRPAYYVQSPS-HDGEKTT 39
          MHAKTDSE TS+DA++ +  RS  RP YYVQSPS HD EK +
Sbjct: 1  MHAKTDSEATSIDAAALSPPRSAIRPLYYVQSPSNHDVEKMS 42


>AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein,
           group 2 | chr3:8972195-8974867 REVERSE LENGTH=506
          Length = 506

 Score =  104 bits (260), Expect = 6e-23,   Method: Compositional matrix adjust.
 Identities = 56/150 (37%), Positives = 79/150 (52%), Gaps = 2/150 (1%)

Query: 129 WGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTVRFTYRNTGTFFGVHVSS 188
           WGAS P  P + VKS+         G D TGVAT +++ NS+V+ T  +   +FG+HVSS
Sbjct: 336 WGASHPFSPIVSVKSVDIHSFYYGEGIDRTGVATKILSFNSSVKVTIDSPAPYFGIHVSS 395

Query: 189 TPLDLSYSEIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLYXXXXXXXXXXXMPTVPVP 248
           +   L++S + +ATG +K +Y              G+++PLY              VPV 
Sbjct: 396 STFKLTFSALTLATGQLKSYYQPRKSKHISIVKLTGAEVPLYGAGPHLAASDKKGKVPV- 454

Query: 249 LKLSFVIRSRAYVLGKLVKPKYYKRIECSI 278
            KL F IRSR  +LGKLVK K+   + CS 
Sbjct: 455 -KLEFEIRSRGNLLGKLVKSKHENHVSCSF 483



 Score = 82.4 bits (202), Expect = 3e-16,   Method: Compositional matrix adjust.
 Identities = 71/257 (27%), Positives = 102/257 (39%), Gaps = 19/257 (7%)

Query: 1   MHAKTDSEVTSLDASSTTRSPRRPAYYVQSPSHDGEKTT----TSFHSTPVIXXXXXXXX 56
           M+ K+DS+VTSLD SS    P+RP YYVQSPS D +K++    T+  +TP          
Sbjct: 3   MYPKSDSDVTSLDLSS----PKRPTYYVQSPSRDSDKSSSVALTTHQTTPT---ESPSHP 55

Query: 57  XXXXXXXXXXXXKMNHHNHRNNSTKPWKDIDVIEEEGLLQSQDHDYTRSRRXXXXXXXXX 116
                              R      W   D  +EEG     +  Y  +R          
Sbjct: 56  SIASRVSNGGGGGFRWKGRRKYHGGIWWPAD--KEEGGDGRYEDLYEDNRGVSIVTCRLI 113

Query: 117 XXXXXXXX-----XXXXWGASRPMKPKIFVKSIKFEHLRVQAGSDATGVATDMITMNSTV 171
                            +GAS+   P +++K +         GSD TGV T ++ +  +V
Sbjct: 114 LGVVATLSIFFLLCSVLFGASQSSPPIVYIKGVNVRSFYYGEGSDNTGVPTKIMNVKCSV 173

Query: 172 RFTYRNTGTFFGVHVSSTPLDLSYS-EIVIATGNVKQFYXXXXXXXXXXXXXMGSKIPLY 230
             T  N  T FG+HVSST + L YS +  +A   +K ++             +GSK+PLY
Sbjct: 174 VITTHNPSTLFGIHVSSTAVSLIYSRQFTLANARLKSYHQPKQSNHTSRINLIGSKVPLY 233

Query: 231 XXXXXXXXXXXMPTVPV 247
                         VPV
Sbjct: 234 GAGAELVASDNSGGVPV 250