Miyakogusa Predicted Gene
- Lj0g3v0361759.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj0g3v0361759.1 Non Characterized Hit- tr|I1MYS4|I1MYS4_SOYBN
Uncharacterized protein OS=Glycine max PE=4 SV=1,80.19,0,SUBFAMILY NOT
NAMED,NULL; FAMILY NOT NAMED,NULL; seg,NULL; DUF4057,Domain of unknown
function DUF405,CUFF.24971.1
(308 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                    Score    E
Sequences producing significant alignments:                        (bits)  Value
AT4G39860.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    353  1e-97
AT4G39860.2 | Symbols: | unknown protein; BEST Arabidopsis thal...    346  1e-95
AT1G35780.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    265  3e-71
AT1G78150.2 | Symbols: | unknown protein; BEST Arabidopsis thal...    259  2e-69
AT1G78150.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    259  2e-69
AT1G78150.3 | Symbols: | unknown protein; FUNCTIONS IN: molecul...    244  4e-65
AT2G22270.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    216  2e-56
AT2G22260.1 | Symbols: | oxidoreductase, 2OG-Fe(II) oxygenase f...     51  1e-06
>AT4G39860.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT2G22270.1); Has 152 Blast hits to 146 proteins
in 19 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 2; Plants - 146; Viruses - 0; Other Eukaryotes -
4 (source: NCBI BLink). | chr4:18499909-18501472 FORWARD
LENGTH=299
Length = 299
Score = 353 bits (905), Expect = 1e-97, Method: Compositional matrix adjust.
Identities = 186/308 (60%), Positives = 228/308 (74%), Gaps = 9/308 (2%)
Query: 1 MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
M NTPVR PHTST+DLL+WSE RS QPSD ISK+L GGQ+T+
Sbjct: 1 MERNTPVRNPHTSTADLLSWSE-----TPPPPHHSTPSAARSHQPSDGISKILGGGQITD 55
Query: 61 EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
EEAQSL K K CSGYK+KE+TGSGIF+ + A ++ +T R QQ +NG+SQ
Sbjct: 56 EEAQSLNKLKNCSGYKLKEMTGSGIFTDKGK--VGSESDATTDPKTGLRYYQQTLNGMSQ 113
Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
ISFS + +VSPKKP T+ EVAKQRELSG L +E D KSNK IS+AK +E++G+DIF PP
Sbjct: 114 ISFSADGNVSPKKPTTLTEVAKQRELSGNLLTEADLKSNKQISSAKIEEISGHDIFAPPS 173
Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
EI PRS+ AA+ E++G++DMGEP PRNLRTSVKVSNPAGGQSN LF E PV+KTSKKIH
Sbjct: 174 EIQPRSLVAAQ-QEARGNRDMGEPAPRNLRTSVKVSNPAGGQSNILFSEEPVVKTSKKIH 232
Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
+ K ELTG IF+G+ PGSA+K LS AKLREM+G+NIF AD K+E++D G R+PPG
Sbjct: 233 NQKFQELTGNGIFKGDESPGSADKQLSSAKLREMSGNNIF-ADGKSESRDYFGGVRKPPG 291
Query: 301 GESSIALL 308
GESSI+L+
Sbjct: 292 GESSISLV 299
>AT4G39860.2 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT2G22270.1); Has 148 Blast hits to 144 proteins
in 18 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 144; Viruses - 0; Other Eukaryotes -
4 (source: NCBI BLink). | chr4:18499909-18501472 FORWARD
LENGTH=298
Length = 298
Score = 346 bits (887), Expect = 1e-95, Method: Compositional matrix adjust.
Identities = 185/308 (60%), Positives = 227/308 (73%), Gaps = 10/308 (3%)
Query: 1 MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
M NTPVR PHTST+DLL+WSE RS QPSD ISK+L GGQ+T+
Sbjct: 1 MERNTPVRNPHTSTADLLSWSE-----TPPPPHHSTPSAARSHQPSDGISKILGGGQITD 55
Query: 61 EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
EEAQSL K K CSGYK+KE+TGSGIF+ + A ++ +T R Q +NG+SQ
Sbjct: 56 EEAQSLNKLKNCSGYKLKEMTGSGIFTDKGK--VGSESDATTDPKTGLRYYQ-TLNGMSQ 112
Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
ISFS + +VSPKKP T+ EVAKQRELSG L +E D KSNK IS+AK +E++G+DIF PP
Sbjct: 113 ISFSADGNVSPKKPTTLTEVAKQRELSGNLLTEADLKSNKQISSAKIEEISGHDIFAPPS 172
Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
EI PRS+ AA+ E++G++DMGEP PRNLRTSVKVSNPAGGQSN LF E PV+KTSKKIH
Sbjct: 173 EIQPRSLVAAQ-QEARGNRDMGEPAPRNLRTSVKVSNPAGGQSNILFSEEPVVKTSKKIH 231
Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
+ K ELTG IF+G+ PGSA+K LS AKLREM+G+NIF AD K+E++D G R+PPG
Sbjct: 232 NQKFQELTGNGIFKGDESPGSADKQLSSAKLREMSGNNIF-ADGKSESRDYFGGVRKPPG 290
Query: 301 GESSIALL 308
GESSI+L+
Sbjct: 291 GESSISLV 298
>AT1G35780.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G78150.2); Has 145 Blast hits to 144 proteins
in 16 species: Archae - 0; Bacteria - 0; Metazoa - 0;
Fungi - 0; Plants - 145; Viruses - 0; Other Eukaryotes -
0 (source: NCBI BLink). | chr1:13277778-13280113 REVERSE
LENGTH=286
Length = 286
Score = 265 bits (677), Expect = 3e-71, Method: Compositional matrix adjust.
Identities = 161/313 (51%), Positives = 199/313 (63%), Gaps = 32/313 (10%)
Query: 1 MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
M NTPVRKPH ST+DLLTW E P RS QPSD ISKV+ GGQ+T+
Sbjct: 1 MEKNTPVRKPHMSTADLLTWPENQP--FESPAAVSSRSAARSHQPSDGISKVVFGGQVTD 58
Query: 61 EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANS--NGRTSRRLVQQAVNGI 118
EE +SL K KPCS YKMKEITGSGIFS E+ SE SANS NG+ SR Q +
Sbjct: 59 EEVESLNKRKPCSNYKMKEITGSGIFSVYEENDDSELASANSATNGK-SRTFQQPPAAIM 117
Query: 119 SQISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGP 178
S ISF EE V+PKKPAT+PEVAKQRELSGTL+ + D+K NK S+AK KEL+G++IF P
Sbjct: 118 SHISFGEEEIVTPKKPATVPEVAKQRELSGTLEYQSDAKLNKQFSDAKCKELSGHNIFAP 177
Query: 179 PPEIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKK 238
PPEI R R K + D+GE + P G LKT+KK
Sbjct: 178 PPEIKLRPT--VRALAYKDNFDLGE----------SDTKPDGE-----------LKTAKK 214
Query: 239 IHDHKLAELTGTNIFQGN-NPPGS--AEKPLSRAKLREMTGSNIFAADAKAETKDPIRGS 295
I D K +L+G N+F+ + + P S AE+ LS AKL+E++G++IF ADAKA+++D G
Sbjct: 215 IADRKFTDLSGNNVFKSDVSSPSSATAERLLSTAKLKEISGNDIF-ADAKAQSRDYFGGV 273
Query: 296 RQPPGGESSIALL 308
R+PPGGESSIAL+
Sbjct: 274 RKPPGGESSIALV 286
>AT1G78150.2 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G35780.1); Has 35333 Blast hits to 34131
proteins in 2444 species: Archae - 798; Bacteria -
22429; Metazoa - 974; Fungi - 991; Plants - 531; Viruses
- 0; Other Eukaryotes - 9610 (source: NCBI BLink). |
chr1:29404996-29406341 FORWARD LENGTH=274
Length = 274
Score = 259 bits (662), Expect = 2e-69, Method: Compositional matrix adjust.
Identities = 156/308 (50%), Positives = 202/308 (65%), Gaps = 34/308 (11%)
Query: 1 MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
M +TPVRKPHTST+DLLTWSE+ +RS QPSD ISKV+ GGQ+T+
Sbjct: 1 MERSTPVRKPHTSTADLLTWSEV---PPPDSPSSASRSAVRSHQPSDGISKVVFGGQVTD 57
Query: 61 EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
EE +SL + KPCS +KMKEITGSGIFS N +D SE + QQAVNGISQ
Sbjct: 58 EEVESLNRRKPCSEHKMKEITGSGIFSRNEKDDASEP----------LPVYQQAVNGISQ 107
Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
ISF EE++SPKKPAT+PEVAKQRELSGT+++E +K K +S+AK KE++G +IF PPP
Sbjct: 108 ISFGEEENLSPKKPATVPEVAKQRELSGTMENESANKLQKQLSDAKYKEISGQNIFAPPP 167
Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
EI PRS R K + ++G A Q+ + E +KT+KKI+
Sbjct: 168 EIKPRS-GTNRALALKDNFNLG----------------AESQTAE---EDSSVKTAKKIY 207
Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
D K AEL+G +IF+G+ + EK LS+AKL+E+ G+NIF AD K E +D + G R+PPG
Sbjct: 208 DKKFAELSGNDIFKGDAASSNVEKHLSQAKLKEIGGNNIF-ADGKVEARDYLGGVRKPPG 266
Query: 301 GESSIALL 308
GE+SIAL+
Sbjct: 267 GETSIALV 274
>AT1G78150.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G35780.1); Has 152 Blast hits to 146 proteins
in 18 species: Archae - 0; Bacteria - 0; Metazoa - 1;
Fungi - 2; Plants - 149; Viruses - 0; Other Eukaryotes -
0 (source: NCBI BLink). | chr1:29404996-29406341 FORWARD
LENGTH=274
Length = 274
Score = 259 bits (662), Expect = 2e-69, Method: Compositional matrix adjust.
Identities = 156/308 (50%), Positives = 202/308 (65%), Gaps = 34/308 (11%)
Query: 1 MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
M +TPVRKPHTST+DLLTWSE+ +RS QPSD ISKV+ GGQ+T+
Sbjct: 1 MERSTPVRKPHTSTADLLTWSEV---PPPDSPSSASRSAVRSHQPSDGISKVVFGGQVTD 57
Query: 61 EEAQSLTKSKPCSGYKMKEITGSGIFSANAEDSTSEAGSANSNGRTSRRLVQQAVNGISQ 120
EE +SL + KPCS +KMKEITGSGIFS N +D SE + QQAVNGISQ
Sbjct: 58 EEVESLNRRKPCSEHKMKEITGSGIFSRNEKDDASEP----------LPVYQQAVNGISQ 107
Query: 121 ISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKSNKLISNAKTKELTGNDIFGPPP 180
ISF EE++SPKKPAT+PEVAKQRELSGT+++E +K K +S+AK KE++G +IF PPP
Sbjct: 108 ISFGEEENLSPKKPATVPEVAKQRELSGTMENESANKLQKQLSDAKYKEISGQNIFAPPP 167
Query: 181 EIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAGGQSNDLFGEAPVLKTSKKIH 240
EI PRS R K + ++G A Q+ + E +KT+KKI+
Sbjct: 168 EIKPRS-GTNRALALKDNFNLG----------------AESQTAE---EDSSVKTAKKIY 207
Query: 241 DHKLAELTGTNIFQGNNPPGSAEKPLSRAKLREMTGSNIFAADAKAETKDPIRGSRQPPG 300
D K AEL+G +IF+G+ + EK LS+AKL+E+ G+NIF AD K E +D + G R+PPG
Sbjct: 208 DKKFAELSGNDIFKGDAASSNVEKHLSQAKLKEIGGNNIF-ADGKVEARDYLGGVRKPPG 266
Query: 301 GESSIALL 308
GE+SIAL+
Sbjct: 267 GETSIALV 274
>AT1G78150.3 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; EXPRESSED IN: 23 plant
structures; EXPRESSED DURING: 13 growth stages; BEST
Arabidopsis thaliana protein match is: unknown protein
(TAIR:AT1G35780.1). | chr1:29404996-29406341 FORWARD
LENGTH=303
Length = 303
Score = 244 bits (624), Expect = 4e-65, Method: Compositional matrix adjust.
Identities = 157/337 (46%), Positives = 203/337 (60%), Gaps = 63/337 (18%)
Query: 1 MHGNTPVRKPHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGGQLTE 60
M +TPVRKPHTST+DLLTWSE+P +RS QPSD ISKV+ GGQ+T+
Sbjct: 1 MERSTPVRKPHTSTADLLTWSEVP---PPDSPSSASRSAVRSHQPSDGISKVVFGGQVTD 57
Query: 61 EEAQSLTK-----------------------------SKPCSGYKMKEITGSGIFSANAE 91
EE +SL + KPCS +KMKEITGSGIFS N +
Sbjct: 58 EEVESLNRRILDDAFDSFMRLVIYTNVKTCENVYDVIRKPCSEHKMKEITGSGIFSRNEK 117
Query: 92 DSTSEAGSANSNGRTSRRLVQQAVNGISQISFSTEESVSPKKPATIPEVAKQRELSGTLQ 151
D SE + QQAVNGISQISF EE++SPKKPAT+PEVAKQRELSGT++
Sbjct: 118 DDASEP----------LPVYQQAVNGISQISFGEEENLSPKKPATVPEVAKQRELSGTME 167
Query: 152 SELDSKSNKLISNAKTKELTGNDIFGPPPEIVPRSVAAARITESKGSKDMGEPLPRNLRT 211
+E +K K +S+AK KE++G +IF PPPEI PRS R K + ++G
Sbjct: 168 NESANKLQKQLSDAKYKEISGQNIFAPPPEIKPRS-GTNRALALKDNFNLG--------- 217
Query: 212 SVKVSNPAGGQSNDLFGEAPVLKTSKKIHDHKLAELTGTNIFQGNNPPGSAEKPLSRAKL 271
A Q+ + E +KT+KKI+D K AEL+G +IF+G+ + EK LS+AKL
Sbjct: 218 -------AESQTAE---EDSSVKTAKKIYDKKFAELSGNDIFKGDAASSNVEKHLSQAKL 267
Query: 272 REMTGSNIFAADAKAETKDPIRGSRQPPGGESSIALL 308
+E+ G+NIF AD K E +D + G R+PPGGE+SIAL+
Sbjct: 268 KEIGGNNIF-ADGKVEARDYLGGVRKPPGGETSIALV 303
>AT2G22270.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT4G39860.2); Has 247 Blast hits to 231 proteins
in 42 species: Archae - 0; Bacteria - 17; Metazoa - 14;
Fungi - 5; Plants - 145; Viruses - 0; Other Eukaryotes -
66 (source: NCBI BLink). | chr2:9463765-9465282 FORWARD
LENGTH=328
Length = 328
Score = 216 bits (550), Expect = 2e-56, Method: Compositional matrix adjust.
Identities = 146/331 (44%), Positives = 194/331 (58%), Gaps = 42/331 (12%)
Query: 10 PHTSTSDLLTWSELPPPDXXXXXXXXXXXGIRSRQPSDRISKVLHGG-QLTEEEAQSL-- 66
PH ST+DLL+WSE+ PD RS QPSD ++ VL GG Q+T E +SL
Sbjct: 8 PHHSTADLLSWSEIRRPDYSTAAN-------RSNQPSDGMNDVLGGGGQITNAETKSLNT 60
Query: 67 --TKSKPCSGYKMKEITGSGIFSANA-------------EDSTSE---AGSANS----NG 104
+ K CSG+K+KE+TGS IFS + +D S+ +G N+ NG
Sbjct: 61 NVSHRKNCSGHKLKEMTGSDIFSDDGKYDPNHQTRIHYHQDQLSQISFSGEENATTPMNG 120
Query: 105 R---TSRRLVQQAVNGISQISFSTEESVSPKKPATIPEVAKQRELSGTLQSELDSKS-NK 160
+ + + + SQISFS EE+V+PKKP T+ E AKQ+ELS T++++ DSK K
Sbjct: 121 KDDPNHQTRIHYHQDQRSQISFSGEENVTPKKPTTLNEAAKQKELSRTVETQADSKCKKK 180
Query: 161 LISNAKTKELTGNDIFGPPPEIVPRSVAAARITESKGSKDMGEPLPRNLRTSVKVSNPAG 220
ISN K K ++G+DIF P R A +E KG+K+ E PR+ R SVK SN G
Sbjct: 181 QISNTKNKAMSGHDIFASPESQPRRLFGGATQSEVKGNKNTEESAPRSSRASVKTSN--G 238
Query: 221 GQSNDLFGEAPVLKTSKKIHDHK--LAELTGTNIFQGNN-PPGSAEKPLSRAKLREMTGS 277
SN LF E V+K+SKKIH+ K LT IF+ + PPG +EK S AK REM+G
Sbjct: 239 QSSNRLFSEEHVVKSSKKIHNQKSQFQGLTSNGIFKSDKIPPGYSEKMQSSAKKREMSGH 298
Query: 278 NIFAADAKAETKDPIRGSRQPPGGESSIALL 308
NIF AD K+E +D G+R+PPGGESSI+L+
Sbjct: 299 NIF-ADGKSEYRDYYGGARRPPGGESSISLV 328
>AT2G22260.1 | Symbols: | oxidoreductase, 2OG-Fe(II) oxygenase
family protein | chr2:9461342-9463053 FORWARD
LENGTH=314
Length = 314
Score = 50.8 bits (120), Expect = 1e-06, Method: Compositional matrix adjust.
Identities = 27/54 (50%), Positives = 35/54 (64%), Gaps = 3/54 (5%)
Query: 41 RSRQPSDRISKVLHGGQLTEEEAQSLTKSKPCSGYKMKEITGSGIFSANAEDST 94
RS QPS S + GQ+T EEA+SL K CSG+K+KE+T S FS N +D +
Sbjct: 12 RSNQPS---SDGISDGQITNEEAESLINKKNCSGHKLKEVTDSDTFSDNGKDDS 62
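Each alignment above carries a statistics line of the form "Identities = a/n (p%), Positives = b/n (q%), Gaps = c/n (r%)". In this report the percentages appear to be truncated rather than rounded: the first hit shows Gaps = 9/308 as 2%, although 9/308 is about 2.9%. The following is a minimal Python sketch for parsing such a line and checking that reading. The names parse_stats, floor_percent and is_consistent are hypothetical helpers written for this illustration and are not part of BLAST.

```python
import re

# One statistics line as printed in the report; the Gaps field is treated
# as optional here (an assumption, for alignments without gaps).
STATS_RE = re.compile(
    r"Identities = (\d+)/(\d+) \((\d+)%\), "
    r"Positives = (\d+)/(\d+) \((\d+)%\)"
    r"(?:, Gaps = (\d+)/(\d+) \((\d+)%\))?"
)


def parse_stats(line):
    """Return {name: (count, alignment_length, reported_percent)}."""
    m = STATS_RE.search(line)
    if m is None:
        raise ValueError("not an alignment statistics line: %r" % line)
    g = m.groups()
    stats = {
        "identities": (int(g[0]), int(g[1]), int(g[2])),
        "positives": (int(g[3]), int(g[4]), int(g[5])),
    }
    if g[6] is not None:
        stats["gaps"] = (int(g[6]), int(g[7]), int(g[8]))
    return stats


def floor_percent(count, total):
    """Integer percentage, truncated toward zero."""
    return count * 100 // total


def is_consistent(line):
    """True if every reported percentage equals the truncated ratio."""
    return all(
        floor_percent(count, total) == reported
        for count, total, reported in parse_stats(line).values()
    )


if __name__ == "__main__":
    first_hit = ("Identities = 186/308 (60%), Positives = 228/308 (74%), "
                 "Gaps = 9/308 (2%)")
    print(parse_stats(first_hit))
    print(is_consistent(first_hit))
```

Truncation is inferred only from the values in this report, so treat the check as a sanity test on the numbers shown here, not as a statement about how every BLAST version formats percentages.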