Miyakogusa Predicted Gene

Lj0g3v0361759.1

BLASTP 2.2.25 [Feb-01-2011]


Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs",  Nucleic Acids Res. 25:3389-3402.

Reference for compositional score matrix adjustment: Altschul, Stephen F., 
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.

Query= Lj0g3v0361759.1 Non Characterized Hit- tr|I1MYS4|I1MYS4_SOYBN
Uncharacterized protein OS=Glycine max PE=4 SV=1,80.19,0,SUBFAMILY NOT
NAMED,NULL; FAMILY NOT NAMED,NULL; seg,NULL; DUF4057,Domain of unknown
function DUF405,CUFF.24971.1
         (308 letters)

Database: TAIR10_pep 
           35,386 sequences; 14,482,855 total letters

Searching..................................................done



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

AT4G39860.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...   353   1e-97
AT4G39860.2 | Symbols:  | unknown protein; BEST Arabidopsis thal...   346   1e-95
AT1G35780.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...   265   3e-71
AT1G78150.2 | Symbols:  | unknown protein; BEST Arabidopsis thal...   259   2e-69
AT1G78150.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...   259   2e-69
AT1G78150.3 | Symbols:  | unknown protein; FUNCTIONS IN: molecul...   244   4e-65
AT2G22270.1 | Symbols:  | unknown protein; BEST Arabidopsis thal...   216   2e-56
AT2G22260.1 | Symbols:  | oxidoreductase, 2OG-Fe(II) oxygenase f...    51   1e-06

>AT4G39860.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT2G22270.1); Has 152 Blast hits to 146 proteins
           in 19 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 2; Plants - 146; Viruses - 0; Other Eukaryotes -
           4 (source: NCBI BLink). | chr4:18499909-18501472 FORWARD
           LENGTH=299
          Length = 299

 Score =  353 bits (905), Expect = 1e-97,   Method: Compositional matrix adjust.
 Identities = 186/308 (60%), Positives = 228/308 (74%), Gaps = 9/308 (2%)

Query: 1   MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
           M  NTPVR PHTST+DLL+WSE                  RS QPSD ISK+L GGQ+T+
Sbjct: 1   MERNTPVRNPHTSTADLLSWSE-----TPPPPHHSTPSAARSHQPSDGISKILGGGQITD 55

Query: 61  EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
           EEAQSL K K CSGYK+KE+TGSGIF+   +        A ++ +T  R  QQ +NG+SQ
Sbjct: 56  EEAQSLNKLKNCSGYKLKEMTGSGIFTDKGK--VGSESDATTDPKTGLRYYQQTLNGMSQ 113

Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
           ISFS + +VSPKKP T+ EVAKQRELSG L +E D KSNK IS+AK +E++G+DIF PP 
Sbjct: 114 ISFSADGNVSPKKPTTLTEVAKQRELSGNLLTEADLKSNKQISSAKIEEISGHDIFAPPS 173

Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
           EI PRS+ AA+  E++G++DMGEP PRNLRTSVKVSNPAGGQSN LF E PV+KTSKKIH
Sbjct: 174 EIQPRSLVAAQ-QEARGNRDMGEPAPRNLRTSVKVSNPAGGQSNILFSEEPVVKTSKKIH 232

Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
           + K  ELTG  IF+G+  PGSA+K LS AKLREM+G+NIF AD K+E++D   G R+PPG
Sbjct: 233 NQKFQELTGNGIFKGDESPGSADKQLSSAKLREMSGNNIF-ADGKSESRDYFGGVRKPPG 291

Query: 301 GESSIALL 308
           GESSI+L+
Sbjct: 292 GESSISLV 299


>AT4G39860.2 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT2G22270.1); Has 148 Blast hits to 144 proteins
           in 18 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 144; Viruses - 0; Other Eukaryotes -
           4 (source: NCBI BLink). | chr4:18499909-18501472 FORWARD
           LENGTH=298
          Length = 298

 Score =  346 bits (887), Expect = 1e-95,   Method: Compositional matrix adjust.
 Identities = 185/308 (60%), Positives = 227/308 (73%), Gaps = 10/308 (3%)

Query: 1   MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
           M  NTPVR PHTST+DLL+WSE                  RS QPSD ISK+L GGQ+T+
Sbjct: 1   MERNTPVRNPHTSTADLLSWSE-----TPPPPHHSTPSAARSHQPSDGISKILGGGQITD 55

Query: 61  EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
           EEAQSL K K CSGYK+KE+TGSGIF+   +        A ++ +T  R  Q  +NG+SQ
Sbjct: 56  EEAQSLNKLKNCSGYKLKEMTGSGIFTDKGK--VGSESDATTDPKTGLRYYQ-TLNGMSQ 112

Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
           ISFS + +VSPKKP T+ EVAKQRELSG L +E D KSNK IS+AK +E++G+DIF PP 
Sbjct: 113 ISFSADGNVSPKKPTTLTEVAKQRELSGNLLTEADLKSNKQISSAKIEEISGHDIFAPPS 172

Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
           EI PRS+ AA+  E++G++DMGEP PRNLRTSVKVSNPAGGQSN LF E PV+KTSKKIH
Sbjct: 173 EIQPRSLVAAQ-QEARGNRDMGEPAPRNLRTSVKVSNPAGGQSNILFSEEPVVKTSKKIH 231

Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
           + K  ELTG  IF+G+  PGSA+K LS AKLREM+G+NIF AD K+E++D   G R+PPG
Sbjct: 232 NQKFQELTGNGIFKGDESPGSADKQLSSAKLREMSGNNIF-ADGKSESRDYFGGVRKPPG 290

Query: 301 GESSIALL 308
           GESSI+L+
Sbjct: 291 GESSISLV 298


>AT1G35780.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT1G78150.2); Has 145 Blast hits to 144 proteins
           in 16 species: Archae - 0; Bacteria - 0; Metazoa - 0;
           Fungi - 0; Plants - 145; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr1:13277778-13280113 REVERSE
           LENGTH=286
          Length = 286

 Score =  265 bits (677), Expect = 3e-71,   Method: Compositional matrix adjust.
 Identities = 161/313 (51%), Positives = 199/313 (63%), Gaps = 32/313 (10%)

Query: 1   MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
           M  NTPVRKPH ST+DLLTW E  P               RS QPSD ISKV+ GGQ+T+
Sbjct: 1   MEKNTPVRKPHMSTADLLTWPENQP--FESPAAVSSRSAARSHQPSDGISKVVFGGQVTD 58

Query: 61  EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANS--NGRTSRRLVQQAVNGI 118
           EE +SL K KPCS YKMKEITGSGIFS   E+  SE  SANS  NG+ SR   Q     +
Sbjct: 59  EEVESLNKRKPCSNYKMKEITGSGIFSVYEENDDSELASANSATNGK-SRTFQQPPAAIM 117

Query: 119 SQISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGP 178
           S ISF  EE V+PKKPAT+PEVAKQRELSGTL+ + D+K NK  S+AK KEL+G++IF P
Sbjct: 118 SHISFGEEEIVTPKKPATVPEVAKQRELSGTLEYQSDAKLNKQFSDAKCKELSGHNIFAP 177

Query: 179 PPEIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKK 238
           PPEI  R     R    K + D+GE            + P G            LKT+KK
Sbjct: 178 PPEIKLRPT--VRALAYKDNFDLGE----------SDTKPDGE-----------LKTAKK 214

Query: 239 IHDHKLAELTGTNIFQGN-NPPGS--AEKPLSRAKLREMTGSNIFAADAKAETKDPIRGS 295
           I D K  +L+G N+F+ + + P S  AE+ LS AKL+E++G++IF ADAKA+++D   G 
Sbjct: 215 IADRKFTDLSGNNVFKSDVSSPSSATAERLLSTAKLKEISGNDIF-ADAKAQSRDYFGGV 273

Query: 296 RQPPGGESSIALL 308
           R+PPGGESSIAL+
Sbjct: 274 RKPPGGESSIALV 286


>AT1G78150.2 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT1G35780.1); Has 35333 Blast hits to 34131
           proteins in 2444 species: Archae - 798; Bacteria -
           22429; Metazoa - 974; Fungi - 991; Plants - 531; Viruses
           - 0; Other Eukaryotes - 9610 (source: NCBI BLink). |
           chr1:29404996-29406341 FORWARD LENGTH=274
          Length = 274

 Score =  259 bits (662), Expect = 2e-69,   Method: Compositional matrix adjust.
 Identities = 156/308 (50%), Positives = 202/308 (65%), Gaps = 34/308 (11%)

Query: 1   MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
           M  +TPVRKPHTST+DLLTWSE+                +RS QPSD ISKV+ GGQ+T+
Sbjct: 1   MERSTPVRKPHTSTADLLTWSEV---PPPDSPSSASRSAVRSHQPSDGISKVVFGGQVTD 57

Query: 61  EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
           EE +SL + KPCS +KMKEITGSGIFS N +D  SE             + QQAVNGISQ
Sbjct: 58  EEVESLNRRKPCSEHKMKEITGSGIFSRNEKDDASEP----------LPVYQQAVNGISQ 107

Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
           ISF  EE++SPKKPAT+PEVAKQRELSGT+++E  +K  K +S+AK KE++G +IF PPP
Sbjct: 108 ISFGEEENLSPKKPATVPEVAKQRELSGTMENESANKLQKQLSDAKYKEISGQNIFAPPP 167

Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
           EI PRS    R    K + ++G                A  Q+ +   E   +KT+KKI+
Sbjct: 168 EIKPRS-GTNRALALKDNFNLG----------------AESQTAE---EDSSVKTAKKIY 207

Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
           D K AEL+G +IF+G+    + EK LS+AKL+E+ G+NIF AD K E +D + G R+PPG
Sbjct: 208 DKKFAELSGNDIFKGDAASSNVEKHLSQAKLKEIGGNNIF-ADGKVEARDYLGGVRKPPG 266

Query: 301 GESSIALL 308
           GE+SIAL+
Sbjct: 267 GETSIALV 274


>AT1G78150.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT1G35780.1); Has 152 Blast hits to 146 proteins
           in 18 species: Archae - 0; Bacteria - 0; Metazoa - 1;
           Fungi - 2; Plants - 149; Viruses - 0; Other Eukaryotes -
           0 (source: NCBI BLink). | chr1:29404996-29406341 FORWARD
           LENGTH=274
          Length = 274

 Score =  259 bits (662), Expect = 2e-69,   Method: Compositional matrix adjust.
 Identities = 156/308 (50%), Positives = 202/308 (65%), Gaps = 34/308 (11%)

Query: 1   MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
           M  +TPVRKPHTST+DLLTWSE+                +RS QPSD ISKV+ GGQ+T+
Sbjct: 1   MERSTPVRKPHTSTADLLTWSEV---PPPDSPSSASRSAVRSHQPSDGISKVVFGGQVTD 57

Query: 61  EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
           EE +SL + KPCS +KMKEITGSGIFS N +D  SE             + QQAVNGISQ
Sbjct: 58  EEVESLNRRKPCSEHKMKEITGSGIFSRNEKDDASEP----------LPVYQQAVNGISQ 107

Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
           ISF  EE++SPKKPAT+PEVAKQRELSGT+++E  +K  K +S+AK KE++G +IF PPP
Sbjct: 108 ISFGEEENLSPKKPATVPEVAKQRELSGTMENESANKLQKQLSDAKYKEISGQNIFAPPP 167

Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
           EI PRS    R    K + ++G                A  Q+ +   E   +KT+KKI+
Sbjct: 168 EIKPRS-GTNRALALKDNFNLG----------------AESQTAE---EDSSVKTAKKIY 207

Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
           D K AEL+G +IF+G+    + EK LS+AKL+E+ G+NIF AD K E +D + G R+PPG
Sbjct: 208 DKKFAELSGNDIFKGDAASSNVEKHLSQAKLKEIGGNNIF-ADGKVEARDYLGGVRKPPG 266

Query: 301 GESSIALL 308
           GE+SIAL+
Sbjct: 267 GETSIALV 274


>AT1G78150.3 | Symbols:  | unknown protein; FUNCTIONS IN:
           molecular_function unknown; INVOLVED IN:
           biological_process unknown; EXPRESSED IN: 23 plant
           structures; EXPRESSED DURING: 13 growth stages; BEST
           Arabidopsis thaliana protein match is: unknown protein
           (TAIR:AT1G35780.1). | chr1:29404996-29406341 FORWARD
           LENGTH=303
          Length = 303

 Score =  244 bits (624), Expect = 4e-65,   Method: Compositional matrix adjust.
 Identities = 157/337 (46%), Positives = 203/337 (60%), Gaps = 63/337 (18%)

Query: 1   MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
           M  +TPVRKPHTST+DLLTWSE+P               +RS QPSD ISKV+ GGQ+T+
Sbjct: 1   MERSTPVRKPHTSTADLLTWSEVP---PPDSPSSASRSAVRSHQPSDGISKVVFGGQVTD 57

Query: 61  EEAQSLTK-----------------------------SKPCSGYKMKEITGSGIFSANAE 91
           EE +SL +                              KPCS +KMKEITGSGIFS N +
Sbjct: 58  EEVESLNRRILDDAFDSFMRLVIYTNVKTCENVYDVIRKPCSEHKMKEITGSGIFSRNEK 117

Query: 92  DSTSEAGSANSNGRTSRRLVQQAVNGISQISFSTEESVSPKKPATIPEVAKQRELSGTLQ 151
           D  SE             + QQAVNGISQISF  EE++SPKKPAT+PEVAKQRELSGT++
Sbjct: 118 DDASEP----------LPVYQQAVNGISQISFGEEENLSPKKPATVPEVAKQRELSGTME 167

Query: 152 SELDSKSNKLISNAKTKELTGNDIFGPPPEIVPRSVAAARITESKGSKDMGEPLPRNLRT 211
           +E  +K  K +S+AK KE++G +IF PPPEI PRS    R    K + ++G         
Sbjct: 168 NESANKLQKQLSDAKYKEISGQNIFAPPPEIKPRS-GTNRALALKDNFNLG--------- 217

Query: 212 SVKVSNPAGGQSNDLFGEAPVLKTSKKIHDHKLAELTGTNIFQGNNPPGSAEKPLSRAKL 271
                  A  Q+ +   E   +KT+KKI+D K AEL+G +IF+G+    + EK LS+AKL
Sbjct: 218 -------AESQTAE---EDSSVKTAKKIYDKKFAELSGNDIFKGDAASSNVEKHLSQAKL 267

Query: 272 REMTGSNIFAADAKAETKDPIRGSRQPPGGESSIALL 308
           +E+ G+NIF AD K E +D + G R+PPGGE+SIAL+
Sbjct: 268 KEIGGNNIF-ADGKVEARDYLGGVRKPPGGETSIALV 303


>AT2G22270.1 | Symbols:  | unknown protein; BEST Arabidopsis
           thaliana protein match is: unknown protein
           (TAIR:AT4G39860.2); Has 247 Blast hits to 231 proteins
           in 42 species: Archae - 0; Bacteria - 17; Metazoa - 14;
           Fungi - 5; Plants - 145; Viruses - 0; Other Eukaryotes -
           66 (source: NCBI BLink). | chr2:9463765-9465282 FORWARD
           LENGTH=328
          Length = 328

 Score =  216 bits (550), Expect = 2e-56,   Method: Compositional matrix adjust.
 Identities = 146/331 (44%), Positives = 194/331 (58%), Gaps = 42/331 (12%)

Query: 10  PHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGG-QLTEEEAQSL-- 66
           PH ST+DLL+WSE+  PD             RS QPSD ++ VL GG Q+T  E +SL  
Sbjct: 8   PHHSTADLLSWSEIRRPDYSTAAN-------RSNQPSDGMNDVLGGGGQITNAETKSLNT 60

Query: 67  --TKSKPCSGYKMKEITGSGIFSANA-------------EDSTSE---AGSANS----NG 104
             +  K CSG+K+KE+TGS IFS +              +D  S+   +G  N+    NG
Sbjct: 61  NVSHRKNCSGHKLKEMTGSDIFSDDGKYDPNHQTRIHYHQDQLSQISFSGEENATTPMNG 120

Query: 105 R---TSRRLVQQAVNGISQISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKS-NK 160
           +     +  +    +  SQISFS EE+V+PKKP T+ E AKQ+ELS T++++ DSK   K
Sbjct: 121 KDDPNHQTRIHYHQDQRSQISFSGEENVTPKKPTTLNEAAKQKELSRTVETQADSKCKKK 180

Query: 161 LISNAKTKELTGNDIFGPPPEIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAG 220
            ISN K K ++G+DIF  P     R    A  +E KG+K+  E  PR+ R SVK SN  G
Sbjct: 181 QISNTKNKAMSGHDIFASPESQPRRLFGGATQSEVKGNKNTEESAPRSSRASVKTSN--G 238

Query: 221 GQSNDLFGEAPVLKTSKKIHDHK--LAELTGTNIFQGNN-PPGSAEKPLSRAKLREMTGS 277
             SN LF E  V+K+SKKIH+ K     LT   IF+ +  PPG +EK  S AK REM+G 
Sbjct: 239 QSSNRLFSEEHVVKSSKKIHNQKSQFQGLTSNGIFKSDKIPPGYSEKMQSSAKKREMSGH 298

Query: 278 NIFAADAKAETKDPIRGSRQPPGGESSIALL 308
           NIF AD K+E +D   G+R+PPGGESSI+L+
Sbjct: 299 NIF-ADGKSEYRDYYGGARRPPGGESSISLV 328


>AT2G22260.1 | Symbols:  | oxidoreductase, 2OG-Fe(II) oxygenase
          family protein | chr2:9461342-9463053 FORWARD
          LENGTH=314
          Length = 314

 Score = 50.8 bits (120), Expect = 1e-06,   Method: Compositional matrix adjust.
 Identities = 27/54 (50%), Positives = 35/54 (64%), Gaps = 3/54 (5%)

Query: 41 RSRQPSDRISKVLHGGQLTEEEAQSLTKSKPCSGYKMKEITGSGIFSANAEDST 94
          RS QPS   S  +  GQ+T EEA+SL   K CSG+K+KE+T S  FS N +D +
Sbjct: 12 RSNQPS---SDGISDGQITNEEAESLINKKNCSGHKLKEVTDSDTFSDNGKDDS 62