Miyakogusa Predicted Gene

Lj0g3v0312559.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj0g3v0312559.1 Non Characterized Hit- tr|I1K248|I1K248_SOYBN
Uncharacterized protein OS=Glycine max PE=4 SV=1,75.24,0,LEA_2,Late
embryogenesis abundant protein, LEA-14; seg,NULL,CUFF.21089.1
         (322 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   223   2e-58
AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   203   1e-52
AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   145   4e-35
AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA) hydr...   142   2e-34
AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late embry...   130   1e-30
AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein, g...   116   3e-26
AT3G08490.1 | Symbols:  | BEST Arabidopsis thaliana protein matc...    74   1e-13
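
As a rough cross-check on the table above, an E value can be approximated from a bit score with the Karlin-Altschul relation E ≈ m · n · 2^(−S), where m is the query length and n is the database size in letters. The sketch below plugs in the raw lengths reported in this output (322 and 14,482,855). BLAST itself works with edge-corrected effective lengths and unrounded bit scores, so the result should only be expected to agree with the reported values to within about an order of magnitude. The function name `approx_evalue` is illustrative and not part of BLAST.

```python
def approx_evalue(bit_score: float, query_len: int, db_letters: int) -> float:
    """Approximate E value from a bit score: E ~= m * n * 2**(-S).

    Assumption: raw (not effective) lengths are used, so this is only an
    order-of-magnitude estimate of what BLAST prints.
    """
    return query_len * db_letters * 2.0 ** (-bit_score)


QUERY_LEN = 322            # "(322 letters)" in the Query header
DB_LETTERS = 14_482_855    # "14,482,855 total letters" in the Database header

# Bit scores of the first and last hits in the summary table.
for hit, bits in [("AT5G42860.1", 223), ("AT3G08490.1", 74)]:
    estimate = approx_evalue(bits, QUERY_LEN, DB_LETTERS)
    print(f"{hit}: bits={bits} approx E={estimate:.1e}")
```

The estimates land in the same decade as the reported 2e-58 and 1e-13, which is about as close as the uncorrected formula can be expected to get.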

>AT5G42860.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 19 plant structures; EXPRESSED DURING: 11
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT1G45688.1); Has 1807 Blast
           hits to 1807 proteins in 277 species: Archae - 0;
           Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
           Viruses - 0; Other Eukaryotes - 339 (source: NCBI
           BLink). | chr5:17183339-17184857 REVERSE LENGTH=320
          Length = 320

 Score =  223 bits (567), Expect = 2e-58,   Method: Compositional matrix adjust.
 Identities = 130/324 (40%), Positives = 176/324 (54%), Gaps = 30/324 (9%)

Query: 15  AKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXXXXXXXXXXXX 74
           AKTDSEV+SL+ SSP RS PRR AYFVQSPSR+S   +D EKT  SF             
Sbjct: 3   AKTDSEVTSLSASSPTRS-PRRPAYFVQSPSRDS---HDGEKTATSFHSTPVLTSPMGSP 58

Query: 75  XXXXXXXVGLHSRESASTRYSRKTARKTPWRPRRDPIEEEGLLDPHDEAQLGFPRRCYFP 134
                      S+ + S R      ++         IEEEGLLD  D  Q   PRRCY  
Sbjct: 59  PHSHSSSSRF-SKINGSKRKGHAGEKQFAM------IEEEGLLDDGDREQEALPRRCYVL 111

Query: 135 XXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTV 194
                            + A++PQKP I +KSITF++  +QAG D  G+ T +++MN+T+
Sbjct: 112 AFIVGFSLLFAFFSLILYAAAKPQKPKISVKSITFEQLKVQAGQDAGGIGTDMITMNATL 171

Query: 195 KLIFRNTATFFGVHVTSTPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYX 254
           ++++RNT TFFGVHVTS+P+D+S+ Q+T+G+G++ KFYQSRKSQR++ V V G  IPLY 
Sbjct: 172 RMLYRNTGTFFGVHVTSSPIDLSFSQITIGSGSIKKFYQSRKSQRTVVVNVLGDKIPLYG 231

Query: 255 XXXX-------------------XXXXXXXXXXEALPLKLRVMVRSRGYVLGKLVKPKFN 295
                                              +P++L   VRSR YVLGKLV+PKF 
Sbjct: 232 SGSTLVPPPPPAPIPKPKKKKGPIVIVEPPAPPAPVPMRLNFTVRSRAYVLGKLVQPKFY 291

Query: 296 KKIECSVVMDPKKMGAPVSLVNKC 319
           K+I C +  + KK+   + + N C
Sbjct: 292 KRIVCLINFEHKKLSKHIPITNNC 315


>AT1G45688.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 258 Blast
           hits to 242 proteins in 39 species: Archae - 0; Bacteria
           - 11; Metazoa - 10; Fungi - 14; Plants - 198; Viruses -
           17; Other Eukaryotes - 8 (source: NCBI BLink). |
           chr1:17191502-17192870 FORWARD LENGTH=342
          Length = 342

 Score =  203 bits (516), Expect = 1e-52,   Method: Compositional matrix adjust.
 Identities = 128/341 (37%), Positives = 173/341 (50%), Gaps = 42/341 (12%)

Query: 15  AKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXX---------- 64
           AKTDSEV+SL  SSP RS PRR  Y+VQSPSR+S   +D EKT  SF             
Sbjct: 3   AKTDSEVTSLAASSPARS-PRRPVYYVQSPSRDS---HDGEKTATSFHSTPVLSPMGSPP 58

Query: 65  ----XXXXXXXXXXXXXXXXXVGLHSRESASTRYSRKTAR--KTPWRPRRDPIEEEGLLD 118
                                +   SR+      S++     +  W+     IEEEGLLD
Sbjct: 59  HSHSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWK-ECAVIEEEGLLD 117

Query: 119 PHDEAQLGFPRRCYFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGA 178
             D    G PRRCY                   +GA++P KP I +KSITF+   +QAG 
Sbjct: 118 DGDRDG-GVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQ 176

Query: 179 DMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTPLDISYYQLTLGTGNMPKFYQSRKSQ 238
           D  GV T +++MN+T+++++RNT TFFGVHVTSTP+D+S+ Q+ +G+G++ KFYQ RKS+
Sbjct: 177 DAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSVKKFYQGRKSE 236

Query: 239 RSIKVTVKGSHIPLYXXXXX--------------------XXXXXXXXXXEALPLKLRVM 278
           R++ V V G  IPLY                                     +P+ L  +
Sbjct: 237 RTVLVHVIGEKIPLYGSGSTLLPPAPPAPLPKPKKKKGAPVPIPDPPAPPAPVPMTLSFV 296

Query: 279 VRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLVNKC 319
           VRSR YVLGKLV+PKF KKIEC +  + K +   + +   C
Sbjct: 297 VRSRAYVLGKLVQPKFYKKIECDINFEHKNLNKHIVITKNC 337


>AT1G45688.2 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT5G42860.1); Has 35333 Blast
           hits to 34131 proteins in 2444 species: Archae - 798;
           Bacteria - 22429; Metazoa - 974; Fungi - 991; Plants -
           531; Viruses - 0; Other Eukaryotes - 9610 (source: NCBI
           BLink). | chr1:17191502-17192464 FORWARD LENGTH=248
          Length = 248

 Score =  145 bits (366), Expect = 4e-35,   Method: Compositional matrix adjust.
 Identities = 93/242 (38%), Positives = 128/242 (52%), Gaps = 26/242 (10%)

Query: 15  AKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXX---------- 64
           AKTDSEV+SL  SSP RS PRR  Y+VQSPSR+S   +D EKT  SF             
Sbjct: 3   AKTDSEVTSLAASSPARS-PRRPVYYVQSPSRDS---HDGEKTATSFHSTPVLSPMGSPP 58

Query: 65  ----XXXXXXXXXXXXXXXXXVGLHSRESASTRYSRKTAR--KTPWRPRRDPIEEEGLLD 118
                                +   SR+      S++     +  W+     IEEEGLLD
Sbjct: 59  HSHSSMGRHSRESSSSRFSGSLKPGSRKVNPNDGSKRKGHGGEKQWK-ECAVIEEEGLLD 117

Query: 119 PHDEAQLGFPRRCYFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGA 178
             D    G PRRCY                   +GA++P KP I +KSITF+   +QAG 
Sbjct: 118 DGDRDG-GVPRRCYVLAFIVGFFILFGFFSLILYGAAKPMKPKITVKSITFETLKIQAGQ 176

Query: 179 DMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTPLDISYYQLTLGTGN----MPKFYQS 234
           D  GV T +++MN+T+++++RNT TFFGVHVTSTP+D+S+ Q+ +G+G+    + K Y+ 
Sbjct: 177 DAGGVGTDMITMNATLRMLYRNTGTFFGVHVTSTPIDLSFSQIKIGSGSVSLPIQKLYRM 236

Query: 235 RK 236
           R+
Sbjct: 237 RE 238


>AT4G35170.1 | Symbols:  | Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family |
           chr4:16736839-16738186 FORWARD LENGTH=299
          Length = 299

 Score =  142 bits (359), Expect = 2e-34,   Method: Compositional matrix adjust.
 Identities = 71/168 (42%), Positives = 97/168 (57%)

Query: 152 WGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTS 211
           WG S+   P   LK +  +   +Q+G D SGV T ++++NSTV++++RN ATFF VHVTS
Sbjct: 129 WGVSKSFAPIATLKEMVLENLNVQSGNDQSGVLTDMLTLNSTVRILYRNPATFFTVHVTS 188

Query: 212 TPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYXXXXXXXXXXXXXXXEAL 271
            PL +SY QL L +G M +F Q RKS+R I+  V G  IPLY                 L
Sbjct: 189 APLQLSYSQLILASGQMGEFSQRRKSERIIETKVFGDQIPLYGGVPALFGQRAEPDQVVL 248

Query: 272 PLKLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLVNKC 319
           PL L   +R+R YVLG+LVK  F+  I+CS+     K+G  + L   C
Sbjct: 249 PLNLTFTLRARAYVLGRLVKTTFHSNIKCSITFYGDKLGKTLDLSKSC 296


>AT2G41990.1 | Symbols:  | CONTAINS InterPro DOMAIN/s: Late
           embryogenesis abundant protein, group 2
           (InterPro:IPR004864); BEST Arabidopsis thaliana protein
           match is: Late embryogenesis abundant (LEA)
           hydroxyproline-rich glycoprotein family
           (TAIR:AT4G35170.1); Has 172 Blast hits to 168 proteins
           in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 172; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr2:17527396-17528527 FORWARD
           LENGTH=297
          Length = 297

 Score =  130 bits (328), Expect = 1e-30,   Method: Compositional matrix adjust.
 Identities = 101/309 (32%), Positives = 148/309 (47%), Gaps = 18/309 (5%)

Query: 15  AKTDSEVSSLTQS--SPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXXXXXXXXXX 72
           AKTDSE +S+  +  SP RS+  R  Y+VQSPS     ++D EK   SF           
Sbjct: 3   AKTDSEATSIDAAALSPPRSA-IRPLYYVQSPS-----NHDVEKM--SFGSGCSLMGSPT 54

Query: 73  XXXXXXXXXVGLHSRESASTRYS-RKTARKTPWRPRRDPIEEEGLLDPHDEAQLGFPRRC 131
                    +  HSRES+++R+S R        R RR  I +        +    F    
Sbjct: 55  HPHYYHCSPIH-HSRESSTSRFSDRALLSYKSIRERRRYINDGDDKTDGGDDDDPFRNVR 113

Query: 132 YFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMN 191
            +                  WGAS+   P + +K +      LQAG D+SGV T ++S+N
Sbjct: 114 LYVWLLLSVIFLFTVFSLILWGASKSYPPKVTVKGMLVRDLNLQAGNDLSGVPTDMLSLN 173

Query: 192 STVKLIFRNTATFFGVHVTSTPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIP 251
           STV++ +RN +TFF VHVT++PL + Y  L L +G M KF   R  + ++   V+G  IP
Sbjct: 174 STVRIYYRNPSTFFAVHVTASPLLLHYSNLLLSSGEMNKFTVGRNGETNVVTVVQGHQIP 233

Query: 252 LYXXXXXXXXXXXXXXXEALPLKLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGA 311
           LY                +LPL L +++ S+ Y+LG+LV  KF  +I CS  +D   +  
Sbjct: 234 LYGGVSFHLDTL------SLPLNLTIVLHSKAYILGRLVTSKFYTRIICSFTLDANHLPK 287

Query: 312 PVSLVNKCI 320
            +SL+  CI
Sbjct: 288 SISLLRSCI 296


>AT3G24600.1 | Symbols:  | Late embryogenesis abundant protein,
           group 2 | chr3:8972195-8974867 REVERSE LENGTH=506
          Length = 506

 Score =  116 bits (290), Expect = 3e-26,   Method: Compositional matrix adjust.
 Identities = 60/167 (35%), Positives = 90/167 (53%), Gaps = 3/167 (1%)

Query: 152 WGASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTS 211
           WGAS P  P + +KS+    F    G D +GVAT ++S NS+VK+   + A +FG+HV+S
Sbjct: 336 WGASHPFSPIVSVKSVDIHSFYYGEGIDRTGVATKILSFNSSVKVTIDSPAPYFGIHVSS 395

Query: 212 TPLDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYXXXXXXXXXXXXXXXEAL 271
           +   +++  LTL TG +  +YQ RKS+    V + G+ +PLY                 +
Sbjct: 396 STFKLTFSALTLATGQLKSYYQPRKSKHISIVKLTGAEVPLYGAGPHLAASDKKGK---V 452

Query: 272 PLKLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLVNK 318
           P+KL   +RSRG +LGKLVK K    + CS  +   K   P+   +K
Sbjct: 453 PVKLEFEIRSRGNLLGKLVKSKHENHVSCSFFISSSKTSKPIEFTHK 499



 Score = 79.3 bits (194), Expect = 3e-15,   Method: Compositional matrix adjust.
 Identities = 65/258 (25%), Positives = 107/258 (41%), Gaps = 40/258 (15%)

Query: 11  MSSLAKTDSEVSSLTQSSPGRSSPRRNAYFVQSPSRESSNDNDAEKTTNSFXXXXXXXXX 70
           M    K+DS+V+SL  SSP     +R  Y+VQSPSR+S   +    TT+           
Sbjct: 1   MKMYPKSDSDVTSLDLSSP-----KRPTYYVQSPSRDSDKSSSVALTTHQTTPTESP--- 52

Query: 71  XXXXXXXXXXXVGLHSRESASTRYSRKTARKTPWRPRRDPIEEEGLLDPHDEAQLGFPR- 129
                          S  S ++R S        W+ RR      G+  P D+ + G  R 
Sbjct: 53  ---------------SHPSIASRVSNGGGGGFRWKGRRK--YHGGIWWPADKEEGGDGRY 95

Query: 130 -------------RCYFPXXXXXXXXXXXXXXXXXWGASRPQKPSILLKSITFDRFVLQA 176
                         C                    +GAS+   P + +K +    F    
Sbjct: 96  EDLYEDNRGVSIVTCRLILGVVATLSIFFLLCSVLFGASQSSPPIVYIKGVNVRSFYYGE 155

Query: 177 GADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTPLDISYY-QLTLGTGNMPKFYQSR 235
           G+D +GV T ++++  +V +   N +T FG+HV+ST + + Y  Q TL    +  ++Q +
Sbjct: 156 GSDNTGVPTKIMNVKCSVVITTHNPSTLFGIHVSSTAVSLIYSRQFTLANARLKSYHQPK 215

Query: 236 KSQRSIKVTVKGSHIPLY 253
           +S  + ++ + GS +PLY
Sbjct: 216 QSNHTSRINLIGSKVPLY 233


>AT3G08490.1 | Symbols:  | BEST Arabidopsis thaliana protein match
           is: Late embryogenesis abundant protein, group 2
           (TAIR:AT3G24600.1); Has 161 Blast hits to 158 proteins
           in 15 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 161; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr3:2574105-2575125 REVERSE
           LENGTH=271
          Length = 271

 Score = 74.3 bits (181), Expect = 1e-13,   Method: Compositional matrix adjust.
 Identities = 41/167 (24%), Positives = 80/167 (47%), Gaps = 3/167 (1%)

Query: 154 ASRPQKPSILLKSITFDRFVLQAGADMSGVATSLVSMNSTVKLIFRNTATFFGVHVTSTP 213
           A++P  P+I  +   F++F+L+ G D  GV+T  ++ N + KLI  N +  FG+H+    
Sbjct: 103 ATQPPHPNISFRIGRFNQFMLEEGVDSHGVSTKFLTFNCSTKLIIDNKSNVFGLHIHPPS 162

Query: 214 LDISYYQLTLGTGNMPKFYQSRKSQRSIKVTVKGSHIPLYXXXXXXXXXXXXXXXEALPL 273
           +   +  L       PK Y       + ++ +  ++  +Y                 LPL
Sbjct: 163 IKFFFGPLNFAKAQGPKLYGLSHESTTFQLYIATTNRAMYGAGTEMNDMLLSRA--GLPL 220

Query: 274 KLRVMVRSRGYVLGKLVKPKFNKKIECSVVMDPKKMGAPVSLV-NKC 319
            LR  + S   V+  ++ PK++ K+EC +++  K+  + V+++  KC
Sbjct: 221 ILRTSIISDYRVVWNIINPKYHHKVECLLLLADKERHSHVTMIREKC 267
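
The percentages on the `Identities = ...` lines can be recomputed from the raw counts. In every record of this report they agree with integer truncation, not rounding: for example, 128/341 is 37.5% but is printed as 37%. The sketch below, using a hypothetical helper named `parse_stats`, parses one such line and reproduces the printed values.

```python
import re

# Matches e.g. "Identities = 128/341 (37%)"; the Gaps field is optional in
# BLAST output, so lines without it simply yield fewer matches.
STAT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line: str) -> dict:
    """Return {field: (count, aligned_length, printed_percent)} for one stats line."""
    return {name: (int(c), int(n), int(p)) for name, c, n, p in STAT_RE.findall(line)}


line = " Identities = 128/341 (37%), Positives = 173/341 (50%), Gaps = 42/341 (12%)"
stats = parse_stats(line)

for name, (count, length, printed) in stats.items():
    truncated = 100 * count // length  # integer division drops the fraction
    print(f"{name}: {count}/{length} printed={printed}% truncated={truncated}%")
```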