Miyakogusa Predicted Gene - Lj2g3v0510100.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj2g3v0510100.1 tr|Q0JA54|Q0JA54_ORYSJ Os04g0615500 protein
(Fragment) OS=Oryza sativa subsp. japonica
GN=Os04g06155,34.82,0.000000000003,APO1 (ACCUMULATION OF PHOTOSYSTEM
ONE 1),NULL; EUKARYOTIC TRANSLATION INITIATION FACTOR SUI1,NULL;
A,CUFF.34647.1
(440 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                     Score    E
Sequences producing significant alignments:                         (bits) Value
AT1G64810.2 | Symbols: APO1 | Arabidopsis thaliana protein of un...   518   e-147
AT1G64810.1 | Symbols: APO1 | Arabidopsis thaliana protein of un...   517   e-147
AT5G57930.2 | Symbols: APO2 | Arabidopsis thaliana protein of un...   313   2e-85
AT5G57930.1 | Symbols: APO2, emb1629 | Arabidopsis thaliana prot...   312   3e-85
AT5G61930.2 | Symbols: APO3 | Arabidopsis thaliana protein of un...   283   2e-76
AT5G61930.1 | Symbols: APO3 | Arabidopsis thaliana protein of un...   283   2e-76
AT3G21740.1 | Symbols: APO4 | Arabidopsis thaliana protein of un...   191   9e-49
>AT1G64810.2 | Symbols: APO1 | Arabidopsis thaliana protein of
unknown function (DUF794) | chr1:24086810-24088276
FORWARD LENGTH=460
Length = 460
Score = 518 bits (1333), Expect = e-147, Method: Compositional matrix adjust.
Identities = 250/413 (60%), Positives = 302/413 (73%), Gaps = 2/413 (0%)
Query: 29 LSASKSYAPALKFNCQHFHKVGSTLPGVILCANRKFRGDEAVWKREQLTQNVDXXXXXXX 88
SA SYA + + G + C N+K R + KR TQNVD
Sbjct: 47 FSARASYALCFQIPTSIPKRECLMRLGTVFCFNQKHREQTSFKKRYVSTQNVDLPPILPK 106
Query: 89 XXXXXXXXXXXXXXHAARTDWKLAHMGIEKPLEPPKNGLLVPDLVPVAYGVFDAWKLLIK 148
AR D KLA MGIEK L+PPKNGLLVP+LVPVA V D WKLLIK
Sbjct: 107 NKKKPYPIPFKQIQEEARKDKKLAQMGIEKQLDPPKNGLLVPNLVPVADQVIDNWKLLIK 166
Query: 149 GLSQLLHVIPAHGCSECSEVHVAPIGHCILDCEGPTSSQRHSSHAWVKGSINDILVPIES 208
GL+QLLHV+P CSEC VHVA +GH I DC GPT+SQR SH+WVKG+IND+L+P+ES
Sbjct: 167 GLAQLLHVVPVFACSECGAVHVANVGHNIRDCNGPTNSQRRGSHSWVKGTINDVLIPVES 226
Query: 209 YHLFDPFGRRIKHQTRFEYDRIPAVVELCIQAGVDIPEYPSRRRTNPVRILGRKILDRGG 268
YH++DPFGRRIKH+TRFEY+RIPA+VELCIQAGV+IPEYP RRRT P+R++G++++DRGG
Sbjct: 227 YHMYDPFGRRIKHETRFEYERIPALVELCIQAGVEIPEYPCRRRTQPIRMMGKRVIDRGG 286
Query: 269 HIEEP-KPWRSAE-SSSLLDFDTYRVCERFPRPSLADLPKIAEETLYAYETVRKGVRKLM 326
+ +EP KP S+ SS L + DT V ER+P P+ D+PKIA+ET+ AYE VR GV KLM
Sbjct: 287 YHKEPEKPQTSSSLSSPLAELDTLGVFERYPPPTPEDIPKIAQETMDAYEKVRLGVTKLM 346
Query: 327 KKYTVKACGYCSEVHVGPWGHNAKLCGAFKHQWRDGKHGWQDATVDEVFPPNYVWHVRDP 386
+K+TVKACGYCSEVHVGPWGH+ KLCG FKHQWRDGKHGWQDA VDEVFPPNYVWHVRD
Sbjct: 347 RKFTVKACGYCSEVHVGPWGHSVKLCGEFKHQWRDGKHGWQDALVDEVFPPNYVWHVRDL 406
Query: 387 SGPPLRSSLKRYYGKTPAVVEVCMQAGAKISDEYKPMMRLDIVIPDSDETRMI 439
G PL +L+R+YGK PA+VE+CM +GA++ YK MMRLDI++PDS E M+
Sbjct: 407 KGNPLTGNLRRFYGKAPALVEICMHSGARVPQRYKAMMRLDIIVPDSQEADMV 459
>AT1G64810.1 | Symbols: APO1 | Arabidopsis thaliana protein of
unknown function (DUF794) | chr1:24086882-24088276
FORWARD LENGTH=436
Length = 436
Score = 517 bits (1332), Expect = e-147, Method: Compositional matrix adjust.
Identities = 250/413 (60%), Positives = 302/413 (73%), Gaps = 2/413 (0%)
Query: 29 LSASKSYAPALKFNCQHFHKVGSTLPGVILCANRKFRGDEAVWKREQLTQNVDXXXXXXX 88
SA SYA + + G + C N+K R + KR TQNVD
Sbjct: 23 FSARASYALCFQIPTSIPKRECLMRLGTVFCFNQKHREQTSFKKRYVSTQNVDLPPILPK 82
Query: 89 XXXXXXXXXXXXXXHAARTDWKLAHMGIEKPLEPPKNGLLVPDLVPVAYGVFDAWKLLIK 148
AR D KLA MGIEK L+PPKNGLLVP+LVPVA V D WKLLIK
Sbjct: 83 NKKKPYPIPFKQIQEEARKDKKLAQMGIEKQLDPPKNGLLVPNLVPVADQVIDNWKLLIK 142
Query: 149 GLSQLLHVIPAHGCSECSEVHVAPIGHCILDCEGPTSSQRHSSHAWVKGSINDILVPIES 208
GL+QLLHV+P CSEC VHVA +GH I DC GPT+SQR SH+WVKG+IND+L+P+ES
Sbjct: 143 GLAQLLHVVPVFACSECGAVHVANVGHNIRDCNGPTNSQRRGSHSWVKGTINDVLIPVES 202
Query: 209 YHLFDPFGRRIKHQTRFEYDRIPAVVELCIQAGVDIPEYPSRRRTNPVRILGRKILDRGG 268
YH++DPFGRRIKH+TRFEY+RIPA+VELCIQAGV+IPEYP RRRT P+R++G++++DRGG
Sbjct: 203 YHMYDPFGRRIKHETRFEYERIPALVELCIQAGVEIPEYPCRRRTQPIRMMGKRVIDRGG 262
Query: 269 HIEEP-KPWRSAE-SSSLLDFDTYRVCERFPRPSLADLPKIAEETLYAYETVRKGVRKLM 326
+ +EP KP S+ SS L + DT V ER+P P+ D+PKIA+ET+ AYE VR GV KLM
Sbjct: 263 YHKEPEKPQTSSSLSSPLAELDTLGVFERYPPPTPEDIPKIAQETMDAYEKVRLGVTKLM 322
Query: 327 KKYTVKACGYCSEVHVGPWGHNAKLCGAFKHQWRDGKHGWQDATVDEVFPPNYVWHVRDP 386
+K+TVKACGYCSEVHVGPWGH+ KLCG FKHQWRDGKHGWQDA VDEVFPPNYVWHVRD
Sbjct: 323 RKFTVKACGYCSEVHVGPWGHSVKLCGEFKHQWRDGKHGWQDALVDEVFPPNYVWHVRDL 382
Query: 387 SGPPLRSSLKRYYGKTPAVVEVCMQAGAKISDEYKPMMRLDIVIPDSDETRMI 439
G PL +L+R+YGK PA+VE+CM +GA++ YK MMRLDI++PDS E M+
Sbjct: 383 KGNPLTGNLRRFYGKAPALVEICMHSGARVPQRYKAMMRLDIIVPDSQEADMV 435
>AT5G57930.2 | Symbols: APO2 | Arabidopsis thaliana protein of
unknown function (DUF794) | chr5:23454690-23456354
FORWARD LENGTH=443
Length = 443
Score = 313 bits (801), Expect = 2e-85, Method: Compositional matrix adjust.
Identities = 148/317 (46%), Positives = 204/317 (64%), Gaps = 3/317 (0%)
Query: 125 NGLLVPDLVPVAYGVFDAWKLLIKGLSQLLHVIPAHGCSECSEVHVAPIGHCILDCEGPT 184
NG++V LVP+AY V++A LI L +L+ V+ + C C+E+HV P GH C+GP
Sbjct: 129 NGMVVKSLVPLAYKVYNARIRLINNLHRLMKVVRVNACGWCNEIHVGPYGHPFKSCKGPN 188
Query: 185 SSQRHSSHAWVKGSINDILVPIESYHLFDPFGRRIKHQTRFEYDRIPAVVELCIQAGVDI 244
+SQR H W I D++VP+E+YHLFD G+RI+H RF R+PAVVELCIQ GV+I
Sbjct: 189 TSQRKGLHEWTNSVIEDVIVPLEAYHLFDRLGKRIRHDERFSIPRVPAVVELCIQGGVEI 248
Query: 245 PEYPSRRRTNPVRILGRKILDRGGHIEEPKPWRSAESSSLLDFDTYRVCERFPRPSLADL 304
PE+P++RR P+ +G+ E P + + V E P S +
Sbjct: 249 PEFPAKRRRKPIIRIGKSEFVDADETELPD--PEPQPPPVPLLTELPVSEITPPSSEEET 306
Query: 305 PKIAEETLYAYETVRKGVRKLMKKYTVKACGYCSEVHVGPWGHNAKLCGAFKHQWRDGKH 364
+AEETL A+E +R G +KLM+ Y V+ CGYC EVHVGP GH A+ CGAFKHQ R+G+H
Sbjct: 307 VSLAEETLQAWEEMRAGAKKLMRMYRVRVCGYCPEVHVGPTGHKAQNCGAFKHQQRNGQH 366
Query: 365 GWQDATVDEVFPPNYVWHVRDPSGPPLRSSLKRYYGKTPAVVEVCMQAGAKISDEYKPMM 424
GWQ A +D++ PP YVWHV D +GPP++ L+ +YG+ PAVVE+C QAGA + + Y+ M
Sbjct: 367 GWQSAVLDDLIPPRYVWHVPDVNGPPMQRELRSFYGQAPAVVEICAQAGAVVPEHYRATM 426
Query: 425 RLDIVIPDS-DETRMII 440
RL++ IP S E M++
Sbjct: 427 RLEVGIPSSVKEAEMVV 443
>AT5G57930.1 | Symbols: APO2, emb1629 | Arabidopsis thaliana protein
of unknown function (DUF794) | chr5:23454690-23456354
FORWARD LENGTH=440
Length = 440
Score = 312 bits (800), Expect = 3e-85, Method: Compositional matrix adjust.
Identities = 148/317 (46%), Positives = 204/317 (64%), Gaps = 3/317 (0%)
Query: 125 NGLLVPDLVPVAYGVFDAWKLLIKGLSQLLHVIPAHGCSECSEVHVAPIGHCILDCEGPT 184
NG++V LVP+AY V++A LI L +L+ V+ + C C+E+HV P GH C+GP
Sbjct: 126 NGMVVKSLVPLAYKVYNARIRLINNLHRLMKVVRVNACGWCNEIHVGPYGHPFKSCKGPN 185
Query: 185 SSQRHSSHAWVKGSINDILVPIESYHLFDPFGRRIKHQTRFEYDRIPAVVELCIQAGVDI 244
+SQR H W I D++VP+E+YHLFD G+RI+H RF R+PAVVELCIQ GV+I
Sbjct: 186 TSQRKGLHEWTNSVIEDVIVPLEAYHLFDRLGKRIRHDERFSIPRVPAVVELCIQGGVEI 245
Query: 245 PEYPSRRRTNPVRILGRKILDRGGHIEEPKPWRSAESSSLLDFDTYRVCERFPRPSLADL 304
PE+P++RR P+ +G+ E P + + V E P S +
Sbjct: 246 PEFPAKRRRKPIIRIGKSEFVDADETELPD--PEPQPPPVPLLTELPVSEITPPSSEEET 303
Query: 305 PKIAEETLYAYETVRKGVRKLMKKYTVKACGYCSEVHVGPWGHNAKLCGAFKHQWRDGKH 364
+AEETL A+E +R G +KLM+ Y V+ CGYC EVHVGP GH A+ CGAFKHQ R+G+H
Sbjct: 304 VSLAEETLQAWEEMRAGAKKLMRMYRVRVCGYCPEVHVGPTGHKAQNCGAFKHQQRNGQH 363
Query: 365 GWQDATVDEVFPPNYVWHVRDPSGPPLRSSLKRYYGKTPAVVEVCMQAGAKISDEYKPMM 424
GWQ A +D++ PP YVWHV D +GPP++ L+ +YG+ PAVVE+C QAGA + + Y+ M
Sbjct: 364 GWQSAVLDDLIPPRYVWHVPDVNGPPMQRELRSFYGQAPAVVEICAQAGAVVPEHYRATM 423
Query: 425 RLDIVIPDS-DETRMII 440
RL++ IP S E M++
Sbjct: 424 RLEVGIPSSVKEAEMVV 440
>AT5G61930.2 | Symbols: APO3 | Arabidopsis thaliana protein of
unknown function (DUF794) | chr5:24866230-24867665
REVERSE LENGTH=402
Length = 402
Score = 283 bits (724), Expect = 2e-76, Method: Compositional matrix adjust.
Identities = 140/321 (43%), Positives = 206/321 (64%), Gaps = 8/321 (2%)
Query: 121 EPPKNGLLVPDLVPVAYGVFDAWKLLIKGLSQLLHVIPAHGCSECSEVHVAPIGHCILDC 180
+PP NGLLVP+LV VA+ V +L+ GLS+++H +P H C C+EVH+ GH I C
Sbjct: 87 DPPDNGLLVPELVDVAHCVHRCRNMLLSGLSKIIHHVPVHRCRLCAEVHIGKQGHEIRTC 146
Query: 181 EGPTSSQRHSSHAWVKGSINDILVPIESYHLFDPFGR-RIKHQTRFEYDRIPAVVELCIQ 239
GP S R ++H W +G ++D+++ + +HL+D + R+ H RF +I AV+ELCIQ
Sbjct: 147 TGPGSGSRSATHVWKRGRVSDVVLFPKCFHLYDRAVKPRVIHDERFTVPKISAVLELCIQ 206
Query: 240 AGVDIPEYPSRRRTNPVRILGRKILDRGGHIEEPKPWRSAESSSLLDFDTYRVCERFPRP 299
AGVD+ ++PS+RR+ PV + +I+D + +++L+ D E+
Sbjct: 207 AGVDLEKFPSKRRSKPVYSIEGRIVDFEDVNDGNSELAVTSTTTLIQEDDRCKEEK---- 262
Query: 300 SLADLPKIAEETLYAYETVRKGVRKLMKKYTVKACGYCSEVHVGPWGHNAKLCGAFKHQW 359
L +++ ET+ ++ + GVRKLM++Y V CGYC E+ VGP GH ++C A KHQ
Sbjct: 263 --KSLKELSFETMESWFEMVLGVRKLMERYRVWTCGYCPEIQVGPKGHKVRMCKATKHQM 320
Query: 360 RDGKHGWQDATVDEVFPPNYVWHVRDPS-GPPLRSSLKRYYGKTPAVVEVCMQAGAKISD 418
RDG H WQ+AT+D+V P YVWHVRDP+ G L +SLKR+YGK PAV+E+C+Q GA + D
Sbjct: 321 RDGMHAWQEATIDDVVGPTYVWHVRDPTDGSVLDNSLKRFYGKAPAVIEMCVQGGAPVPD 380
Query: 419 EYKPMMRLDIVIPDSDETRMI 439
+Y MMRLD+V P DE ++
Sbjct: 381 QYNSMMRLDVVYPQRDEVDLV 401
>AT5G61930.1 | Symbols: APO3 | Arabidopsis thaliana protein of
unknown function (DUF794) | chr5:24866230-24867665
REVERSE LENGTH=402
Length = 402
Score = 283 bits (724), Expect = 2e-76, Method: Compositional matrix adjust.
Identities = 140/321 (43%), Positives = 206/321 (64%), Gaps = 8/321 (2%)
Query: 121 EPPKNGLLVPDLVPVAYGVFDAWKLLIKGLSQLLHVIPAHGCSECSEVHVAPIGHCILDC 180
+PP NGLLVP+LV VA+ V +L+ GLS+++H +P H C C+EVH+ GH I C
Sbjct: 87 DPPDNGLLVPELVDVAHCVHRCRNMLLSGLSKIIHHVPVHRCRLCAEVHIGKQGHEIRTC 146
Query: 181 EGPTSSQRHSSHAWVKGSINDILVPIESYHLFDPFGR-RIKHQTRFEYDRIPAVVELCIQ 239
GP S R ++H W +G ++D+++ + +HL+D + R+ H RF +I AV+ELCIQ
Sbjct: 147 TGPGSGSRSATHVWKRGRVSDVVLFPKCFHLYDRAVKPRVIHDERFTVPKISAVLELCIQ 206
Query: 240 AGVDIPEYPSRRRTNPVRILGRKILDRGGHIEEPKPWRSAESSSLLDFDTYRVCERFPRP 299
AGVD+ ++PS+RR+ PV + +I+D + +++L+ D E+
Sbjct: 207 AGVDLEKFPSKRRSKPVYSIEGRIVDFEDVNDGNSELAVTSTTTLIQEDDRCKEEK---- 262
Query: 300 SLADLPKIAEETLYAYETVRKGVRKLMKKYTVKACGYCSEVHVGPWGHNAKLCGAFKHQW 359
L +++ ET+ ++ + GVRKLM++Y V CGYC E+ VGP GH ++C A KHQ
Sbjct: 263 --KSLKELSFETMESWFEMVLGVRKLMERYRVWTCGYCPEIQVGPKGHKVRMCKATKHQM 320
Query: 360 RDGKHGWQDATVDEVFPPNYVWHVRDPS-GPPLRSSLKRYYGKTPAVVEVCMQAGAKISD 418
RDG H WQ+AT+D+V P YVWHVRDP+ G L +SLKR+YGK PAV+E+C+Q GA + D
Sbjct: 321 RDGMHAWQEATIDDVVGPTYVWHVRDPTDGSVLDNSLKRFYGKAPAVIEMCVQGGAPVPD 380
Query: 419 EYKPMMRLDIVIPDSDETRMI 439
+Y MMRLD+V P DE ++
Sbjct: 381 QYNSMMRLDVVYPQRDEVDLV 401
>AT3G21740.1 | Symbols: APO4 | Arabidopsis thaliana protein of
unknown function (DUF794) | chr3:7662542-7663638 REVERSE
LENGTH=337
Length = 337
Score = 191 bits (485), Expect = 9e-49, Method: Compositional matrix adjust.
Identities = 117/310 (37%), Positives = 149/310 (48%), Gaps = 31/310 (10%)
Query: 116 IEKPLEPPKNGLLVPDLVPVAYGVFDAWKLLIKGLSQLLHVIPAHGCSECSEVHVAPIGH 175
I K +E V ++VPVA + A K LI ++ LL V P C CSEV V GH
Sbjct: 42 ILKRIENRAKDYPVKEIVPVAEEILIARKNLISNIAALLKVFPVLTCKFCSEVFVGKEGH 101
Query: 176 CILDCEGPTSSQRHSSHAWVKGSINDILVPIESYHLFDPFGRRIKHQTRFEYDRIPAVVE 235
I C + H WV GSINDILVP+ESYHL + I+HQ RF+YDR+PA++E
Sbjct: 102 LIETCRSYIRRGNNRLHEWVPGSINDILVPVESYHLHNISQGVIRHQERFDYDRVPAILE 161
Query: 236 LCIQAGVDIPEYPSRRRTNPVRILGRKILDRGGHIEEPKPWRSAESSSLLDFDTYRVCER 295
LC QAG PE IL + I E E
Sbjct: 162 LCCQAGAIHPE----------EILQYSEIHDNPQISE---------------------ED 190
Query: 296 FPRPSLADLPKIAEETLYAYETVRKGVRKLMKKYTVKACGYCSEVHVGPWGHNAKLCGAF 355
DL + L A+E VR GV+KL+ Y K C C EVHVGP GH A+LCG F
Sbjct: 191 IRSLPAGDLKYVGANALMAWEKVRAGVKKLLLVYPSKVCKRCKEVHVGPSGHKARLCGVF 250
Query: 356 KHQWRDGKHGWQDATVDEVFPPNYVWHVRDPSGPPLRSSLKRYYGKTPAVVEVCMQAGAK 415
K++ G H W+ A V+++ P VWH R L + YYG PA+V +C GA
Sbjct: 251 KYESWRGTHYWEKAGVNDLVPEKMVWHRRPQDPVVLVDEGRSYYGHAPAIVSLCSHTGAI 310
Query: 416 ISDEYKPMMR 425
+ +Y M+
Sbjct: 311 VPVKYACKMK 320