Miyakogusa Predicted Gene

Lj1g3v2313060.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj1g3v2313060.1 Non Characterized Hit- tr|I1R565|I1R565_ORYGL
Uncharacterized protein OS=Oryza glaberrima PE=4
SV=1,30.16,4e-18,seg,NULL; SUBFAMILY NOT NAMED,NULL; FAMILY NOT
NAMED,NULL,CUFF.28871.1
         (505 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT2G40070.1 | Symbols:  | BEST Arabidopsis thaliana protein matc...   201   1e-51
AT2G40070.2 | Symbols:  | FUNCTIONS IN: molecular_function unkno...   201   1e-51
AT3G09000.1 | Symbols:  | proline-rich family protein | chr3:274...   144   1e-34
AT5G01280.1 | Symbols:  | BEST Arabidopsis thaliana protein matc...   101   1e-21
AT2G38160.2 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...    70   3e-12
AT2G38160.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...    70   3e-12
AT1G27850.1 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...    53   4e-07
AT3G08670.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...    52   8e-07

>AT2G40070.1 | Symbols:  | BEST Arabidopsis thaliana protein match
           is: proline-rich family protein (TAIR:AT3G09000.1); Has
           35333 Blast hits to 34131 proteins in 2444 species:
           Archae - 798; Bacteria - 22429; Metazoa - 974; Fungi -
           991; Plants - 531; Viruses - 0; Other Eukaryotes - 9610
           (source: NCBI BLink). | chr2:16728378-16731160 REVERSE
           LENGTH=607
          Length = 607

 Score =  201 bits (511), Expect = 1e-51,   Method: Compositional matrix adjust.
 Identities = 130/272 (47%), Positives = 163/272 (59%), Gaps = 15/272 (5%)

Query: 239 RPSSASKARPIV--AKNPAQSRGISPSVKSRPWEPSQMPGYSLEAPPNLKTSLPERPASA 296
           +PSS + A+P+   +KNPA SR  SP+V+SRPW+PS MPG+SLE PPNL+T+LPERP SA
Sbjct: 339 KPSSPAPAKPMPTPSKNPALSRAASPTVRSRPWKPSDMPGFSLETPPNLRTTLPERPLSA 398

Query: 297 TRSRPGAQNTXXXX--XXXXXXXXXXXXXXXXXKGRA---STGFALLNYSSMQALSRARF 351
           TR RPGA ++                       +GRA   S+G      SS+ A++R   
Sbjct: 399 TRGRPGAPSSRSGSVEPGGPPGGRPRRQSCSPSRGRAPMYSSG------SSVPAVNRGYS 452

Query: 352 TDGDHDSPGEVGTKMVERVVNMRKLAPPKREDXXXXXXXXXXXXXXXXXXXFGCTLSKTS 411
              D+ SP  +GTKMVERV+NMRKLAPP+ +D                   FG TLSK S
Sbjct: 453 KASDNVSPVMMGTKMVERVINMRKLAPPRSDDKGSPHGNLSAKSSSPDSAGFGRTLSKKS 512

Query: 412 LDMAKRHMDIRRSIQGNLRPVVTNIPASSTYNVRSASASKSRTISVSD-SPLATSSTAXX 470
           LDMA RHMDIRR+I GNLRP++TNIPASS Y+VRS   ++ R ++VSD SPLATSS A  
Sbjct: 513 LDMAIRHMDIRRTIPGNLRPLMTNIPASSMYSVRSGH-TRGRPMNVSDSSPLATSSNASS 571

Query: 471 XXXXXXXXXXYDGSEIGENDFGSERGNSSPMS 502
                        +   E+D GSERG  SP S
Sbjct: 572 EISVCNNNGICLEASEKEDDAGSERGCRSPAS 603



 Score = 84.3 bits (207), Expect = 2e-16,   Method: Compositional matrix adjust.
 Identities = 52/136 (38%), Positives = 69/136 (50%), Gaps = 3/136 (2%)

Query: 2   VMKERDEELSLFLEMRRRXXXXXXXXXXXXXXXXXXXXXXXXXRGSSMISKTMILVPP-R 60
           +M E+DEELSLFLEMRRR                          G+S +       PP R
Sbjct: 27  MMAEKDEELSLFLEMRRREKEQDNLLLNNNPDEFETPLGSK--HGTSPVFNISSGAPPSR 84

Query: 61  KTGVEVFLNSENGKSEYEWLLTPPDSPRFPTLEKQSQISAKNDMETRNARPTALKPRVAN 120
           K   + FLNSE  K++YEWLLTPP +P FP+LE +S  +  +      +RP  L  R+AN
Sbjct: 85  KAAPDDFLNSEGDKNDYEWLLTPPGTPLFPSLEMESHRTMMSQTGDSKSRPATLTSRLAN 144

Query: 121 IQAEPAARSNAVSKNH 136
              E AAR++  S+  
Sbjct: 145 SSTESAARNHLTSRQQ 160


>AT2G40070.2 | Symbols:  | FUNCTIONS IN: molecular_function unknown;
           INVOLVED IN: biological_process unknown; LOCATED IN:
           cellular_component unknown; EXPRESSED IN: 17 plant
           structures; EXPRESSED DURING: 7 growth stages; BEST
           Arabidopsis thaliana protein match is: proline-rich
           family protein (TAIR:AT3G09000.1); Has 108635 Blast hits
           to 60786 proteins in 2176 species: Archae - 287;
           Bacteria - 15142; Metazoa - 39415; Fungi - 26849; Plants
           - 4416; Viruses - 2864; Other Eukaryotes - 19662
           (source: NCBI BLink). | chr2:16728378-16731040 REVERSE
           LENGTH=567
          Length = 567

 Score =  201 bits (510), Expect = 1e-51,   Method: Compositional matrix adjust.
 Identities = 130/272 (47%), Positives = 163/272 (59%), Gaps = 15/272 (5%)

Query: 239 RPSSASKARPIV--AKNPAQSRGISPSVKSRPWEPSQMPGYSLEAPPNLKTSLPERPASA 296
           +PSS + A+P+   +KNPA SR  SP+V+SRPW+PS MPG+SLE PPNL+T+LPERP SA
Sbjct: 299 KPSSPAPAKPMPTPSKNPALSRAASPTVRSRPWKPSDMPGFSLETPPNLRTTLPERPLSA 358

Query: 297 TRSRPGAQNTXXXX--XXXXXXXXXXXXXXXXXKGRA---STGFALLNYSSMQALSRARF 351
           TR RPGA ++                       +GRA   S+G      SS+ A++R   
Sbjct: 359 TRGRPGAPSSRSGSVEPGGPPGGRPRRQSCSPSRGRAPMYSSG------SSVPAVNRGYS 412

Query: 352 TDGDHDSPGEVGTKMVERVVNMRKLAPPKREDXXXXXXXXXXXXXXXXXXXFGCTLSKTS 411
              D+ SP  +GTKMVERV+NMRKLAPP+ +D                   FG TLSK S
Sbjct: 413 KASDNVSPVMMGTKMVERVINMRKLAPPRSDDKGSPHGNLSAKSSSPDSAGFGRTLSKKS 472

Query: 412 LDMAKRHMDIRRSIQGNLRPVVTNIPASSTYNVRSASASKSRTISVSD-SPLATSSTAXX 470
           LDMA RHMDIRR+I GNLRP++TNIPASS Y+VRS   ++ R ++VSD SPLATSS A  
Sbjct: 473 LDMAIRHMDIRRTIPGNLRPLMTNIPASSMYSVRSGH-TRGRPMNVSDSSPLATSSNASS 531

Query: 471 XXXXXXXXXXYDGSEIGENDFGSERGNSSPMS 502
                        +   E+D GSERG  SP S
Sbjct: 532 EISVCNNNGICLEASEKEDDAGSERGCRSPAS 563



 Score = 67.8 bits (164), Expect = 2e-11,   Method: Compositional matrix adjust.
 Identities = 34/79 (43%), Positives = 47/79 (59%)

Query: 58  PPRKTGVEVFLNSENGKSEYEWLLTPPDSPRFPTLEKQSQISAKNDMETRNARPTALKPR 117
           P RK   + FLNSE  K++YEWLLTPP +P FP+LE +S  +  +      +RP  L  R
Sbjct: 42  PSRKAAPDDFLNSEGDKNDYEWLLTPPGTPLFPSLEMESHRTMMSQTGDSKSRPATLTSR 101

Query: 118 VANIQAEPAARSNAVSKNH 136
           +AN   E AAR++  S+  
Sbjct: 102 LANSSTESAARNHLTSRQQ 120


>AT3G09000.1 | Symbols:  | proline-rich family protein |
           chr3:2746014-2748326 FORWARD LENGTH=541
          Length = 541

 Score =  144 bits (363), Expect = 1e-34,   Method: Compositional matrix adjust.
 Identities = 108/259 (41%), Positives = 134/259 (51%), Gaps = 28/259 (10%)

Query: 255 AQSRGISPS---VKSRPWEPSQMPGYSLEAPPNLKTSLPERPASATRSRPGAQNTXXXXX 311
           A SRG SPS     SRPW+P +MPG+SLEAPPNL+T+L +RP SA+R RPG  +      
Sbjct: 278 APSRGTSPSPTLNSSRPWKPPEMPGFSLEAPPNLRTTLADRPVSASRGRPGVASAPGSRS 337

Query: 312 XX---------XXXXXXXXXXXXXXKGRASTGFALLNYSSMQALSRARFTDG----DHDS 358
                                    +GRA  G    N S      RA+ ++G    D+ S
Sbjct: 338 GSIERGGGPTSGGSGNARRQSCSPSRGRAPIGNT--NGSLTGVRGRAKASNGGSGCDNLS 395

Query: 359 PGEVGTKMVERVVNMRKLAPPKREDXXXXXXXXXXXXXXXXXXXFGCTLSKTSLDMAKRH 418
           P  +G KMVERVVNMRKL PP+  +                   +G  LSK+S+DMA RH
Sbjct: 396 PVAMGNKMVERVVNMRKLGPPRLTENGGRGSGKSSSAFNSLG--YGRNLSKSSIDMAIRH 453

Query: 419 MDIRRSIQGNLRPVVTNIPASSTYNVRSASASKSRTISVSDSPLATSSTAXXXXXXXXXX 478
           MDIRR + GNLRP+VT +PASS Y+VR      SR  SVS SP+ATSST           
Sbjct: 454 MDIRRGMTGNLRPLVTKVPASSMYSVR------SRPGSVSSSPVATSSTVSSSDPSVDNI 507

Query: 479 X--XYDGSEIGENDFGSER 495
                DG+E   +D  SER
Sbjct: 508 NILCLDGNEAENDDLLSER 526



 Score = 82.4 bits (202), Expect = 6e-16,   Method: Compositional matrix adjust.
 Identities = 55/147 (37%), Positives = 75/147 (51%), Gaps = 8/147 (5%)

Query: 1   MVMKERDEELSLFLEMRRRXXXXXXXX--XXXXXXXXXXXXXXXXXRGSSMISKTM--IL 56
           M+  +RDEELSLFLEMRRR                              S +S+T     
Sbjct: 1   MLTHDRDEELSLFLEMRRREKEHRADSLLTGSDNVSINATLTAAAAAALSGVSETASSQR 60

Query: 57  VPPRKTGVEVFLNSENGKSEYEWLLTPPDSPRFPTLEKQSQISAKNDMETRNARPTALKP 116
            P R+T  E FL SEN KS+Y+WLLTPP +P+F   EK+S  S  N  +  N+RPT LK 
Sbjct: 61  YPLRRTAAENFLYSENEKSDYDWLLTPPGTPQF---EKESHRSVMNQHDAPNSRPTVLKS 117

Query: 117 RVANIQAE-PAARSNAVSKNHAAVTGL 142
           R+ N + +  +  +N    + ++V GL
Sbjct: 118 RLGNCREDIVSGNNNKPQTSSSSVAGL 144


>AT5G01280.1 | Symbols:  | BEST Arabidopsis thaliana protein match
           is: proline-rich family protein (TAIR:AT3G09000.1); Has
           1807 Blast hits to 1807 proteins in 277 species: Archae
           - 0; Bacteria - 0; Metazoa - 736; Fungi - 347; Plants -
           385; Viruses - 0; Other Eukaryotes - 339 (source: NCBI
           BLink). | chr5:114185-116237 REVERSE LENGTH=460
          Length = 460

 Score =  101 bits (252), Expect = 1e-21,   Method: Compositional matrix adjust.
 Identities = 75/204 (36%), Positives = 101/204 (49%), Gaps = 10/204 (4%)

Query: 254 PAQSRGISPSVKSRPWEPSQMPGYSLEAPPNLKTSLPERPASATRSRPGAQNTXXXXXXX 313
           PA S   SP V+SRPWEP +MPG+S+EAP NL+T+LP+RP +A+ SR  A +        
Sbjct: 224 PALSLEASPIVRSRPWEPYEMPGFSVEAPSNLRTTLPDRPQTASSSRTRAFDASSSSRSA 283

Query: 314 XXXXXXXXXXXXX-XKGRASTGFALLNYSSMQAL-SRARFTDGDHDSPGEVGTKMVERVV 371
                          + RA  G       S++   ++    DG   S    G + VE+VV
Sbjct: 284 STERDVAKRQSCSPSRSRAPNGNVNGAVPSLRGQRAKTNNDDGRLISHAAKGNQKVEKVV 343

Query: 372 NMRKLAPPKREDXXXXXXXXXXXXXXXXXXXFGC-------TLSKTSLDMAKRHMDIRR- 423
           NMRKLA P+  +                    G         LSK+S+DMA RHMD+R+ 
Sbjct: 344 NMRKLATPRLTESGSRRLGGGGGDSSAGKSSSGSGGFGFGRNLSKSSIDMALRHMDVRKG 403

Query: 424 SIQGNLRPVVTNIPASSTYNVRSA 447
           S+ GN R  VT  PA+S Y+VRS 
Sbjct: 404 SMAGNFRHSVTKAPATSVYSVRSC 427


>AT2G38160.2 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN:
           cellular_component unknown; EXPRESSED IN: 9 plant
           structures; EXPRESSED DURING: 4 anthesis, F mature
           embryo stage, petal differentiation and expansion stage,
           E expanded cotyledon stage, D bilateral stage; BEST
           Arabidopsis thaliana protein match is: unknown protein
           (TAIR:AT2G40070.2). | chr2:15986643-15988464 REVERSE
           LENGTH=314
          Length = 314

 Score = 70.5 bits (171), Expect = 3e-12,   Method: Compositional matrix adjust.
 Identities = 38/76 (50%), Positives = 46/76 (60%), Gaps = 14/76 (18%)

Query: 359 PGEVGTKMVERVVNMRKLAPPKREDXXXXXXXXXXXXXXXXXXXFGCTLSKTSLDMAKRH 418
           P  +GT+MVERVVNMRKL PPK +D                   FG TLS++SLDMA RH
Sbjct: 241 PVLMGTQMVERVVNMRKLPPPKHDDNTTLG--------------FGRTLSRSSLDMALRH 286

Query: 419 MDIRRSIQGNLRPVVT 434
           M+IR S+  NLR  V+
Sbjct: 287 MNIRHSVSKNLRVTVS 302


>AT2G38160.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT2G40070.2); Has 972 Blast hits to 731 proteins
           in 211 species: Archae - 0; Bacteria - 236; Metazoa -
           194; Fungi - 201; Plants - 218; Viruses - 32; Other
           Eukaryotes - 91 (source: NCBI BLink). |
           chr2:15986643-15988464 REVERSE LENGTH=314
          Length = 314

 Score = 70.5 bits (171), Expect = 3e-12,   Method: Compositional matrix adjust.
 Identities = 38/76 (50%), Positives = 46/76 (60%), Gaps = 14/76 (18%)

Query: 359 PGEVGTKMVERVVNMRKLAPPKREDXXXXXXXXXXXXXXXXXXXFGCTLSKTSLDMAKRH 418
           P  +GT+MVERVVNMRKL PPK +D                   FG TLS++SLDMA RH
Sbjct: 241 PVLMGTQMVERVVNMRKLPPPKHDDNTTLG--------------FGRTLSRSSLDMALRH 286

Query: 419 MDIRRSIQGNLRPVVT 434
           M+IR S+  NLR  V+
Sbjct: 287 MNIRHSVSKNLRVTVS 302


>AT1G27850.1 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; LOCATED IN: plasma membrane;
           EXPRESSED IN: 22 plant structures; EXPRESSED DURING: 13
           growth stages; BEST Arabidopsis thaliana protein match
           is: unknown protein (TAIR:AT2G40070.1); Has 9215 Blast
           hits to 5316 proteins in 473 species: Archae - 6;
           Bacteria - 773; Metazoa - 3392; Fungi - 1710; Plants -
           539; Viruses - 143; Other Eukaryotes - 2652 (source:
           NCBI BLink). | chr1:9699265-9703701 FORWARD LENGTH=1148
          Length = 1148

 Score = 53.1 bits (126), Expect = 4e-07,   Method: Compositional matrix adjust.
 Identities = 29/50 (58%), Positives = 38/50 (76%), Gaps = 2/50 (4%)

Query: 257 SRGISPSVKSRPWEPSQMPGYSLEAPPNLKTSLPERPASATR-SRPGAQN 305
           SRG SPS K + W+ S +PG+SL+APPNL+TSL +RPAS  R S P ++N
Sbjct: 234 SRGNSPSPKIKVWQ-SNIPGFSLDAPPNLRTSLGDRPASYVRGSSPASRN 282


>AT3G08670.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT3G51540.1); Has 48380 Blast hits to 29827
           proteins in 1356 species: Archae - 46; Bacteria - 5589;
           Metazoa - 17361; Fungi - 13192; Plants - 2237; Viruses -
           905; Other Eukaryotes - 9050 (source: NCBI BLink). |
           chr3:2633946-2636536 FORWARD LENGTH=567
          Length = 567

 Score = 52.4 bits (124), Expect = 8e-07,   Method: Compositional matrix adjust.
 Identities = 47/165 (28%), Positives = 66/165 (40%), Gaps = 29/165 (17%)

Query: 262 PSVKSRPWEPSQMPGYSLEAPPNLKTSLPERPASATRSRP-GAQNTXXXXXXXXXXXXXX 320
           P V++ P +P  +  + L+ PPNL+TSLP+RP SA RSRP G  +               
Sbjct: 313 PRVRNTPQQPIVLADFPLDTPPNLRTSLPDRPISAGRSRPVGGSSMAKASPEPKGPITRR 372

Query: 321 XXXXXXXKGRASTGFALLNYSSMQALSRARF-TDGDH--DSPGEVGTKMVERVVNMRKLA 377
                  +GR +           +   + RF  +G H  D+P         R+ N+    
Sbjct: 373 NSSPIVTRGRLT-----------ETQGKGRFGGNGQHLTDAP------EPRRISNV---- 411

Query: 378 PPKREDXXXXXXXXXXXXXXXXXXXFGCTLSKTSLDMAKRHMDIR 422
                D                    G + SK+SLDMA RHMDIR
Sbjct: 412 ----SDITSRRTVKTSTTVTDNNNGLGRSFSKSSLDMAIRHMDIR 452