Miyakogusa Predicted Gene
- Lj0g3v0356229.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj0g3v0356229.1 Non Characterized Hit- tr|C5XFJ9|C5XFJ9_SORBI
Putative uncharacterized protein Sb03g043330
OS=Sorghu,28.35,7e-19,seg,NULL; LEA_2,Late embryogenesis abundant
protein, LEA-14,CUFF.24522.1
(247 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                    Score  E
Sequences producing significant alignments:                        (bits)  Value
AT4G26490.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...    240  8e-64
AT5G56050.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...    226  8e-60
AT3G26350.1 | Symbols: | LOCATED IN: chloroplast; EXPRESSED IN:...    139  1e-33
AT1G13050.1 | Symbols: | unknown protein; BEST Arabidopsis thal...    124  7e-29
AT1G13050.2 | Symbols: | unknown protein; FUNCTIONS IN: molecul...    119  1e-27
AT5G22870.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...     56  2e-08
AT5G56070.1 | Symbols: | unknown protein; BEST Arabidopsis thal...     56  2e-08
AT3G11650.1 | Symbols: NHL2 | NDR1/HIN1-like 2 | chr3:3676264-36...    53  2e-07
AT2G35460.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...     52  3e-07
AT3G52470.1 | Symbols: | Late embryogenesis abundant (LEA) hydr...     50  2e-06
AT3G11660.1 | Symbols: NHL1 | NDR1/HIN1-like 1 | chr3:3679031-36...    47  8e-06
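The summary rows above can be read programmatically. The following is a minimal Python sketch, not part of the BLAST distribution: the `Hit` type, the `HIT_LINE` pattern, and `parse_hit_line` are hypothetical names introduced here. It assumes each row ends with two whitespace-separated fields (bit score, then E-value) and that everything after the first `|` is description text.

```python
import re
from typing import NamedTuple


class Hit(NamedTuple):
    """One row of the 'Sequences producing significant alignments' table."""
    subject_id: str
    description: str
    bit_score: float
    e_value: float


# Hypothetical pattern: subject ID, a '|' separator, free-text description,
# then the last two whitespace-separated fields (bit score and E-value).
HIT_LINE = re.compile(
    r"^(\S+)\s*\|\s*(.*?)\s+(\d+(?:\.\d+)?)\s+"
    r"((?:\d+(?:\.\d+)?)?[eE][-+]?\d+|\d+(?:\.\d+)?)$"
)


def parse_hit_line(line: str) -> Hit:
    """Parse a single summary row; raise ValueError if it does not match."""
    match = HIT_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    subject_id, description, score, e_value = match.groups()
    # Some BLAST versions print very small E-values as 'e-100';
    # float() needs a leading mantissa in that case.
    if e_value[0] in "eE":
        e_value = "1" + e_value
    return Hit(subject_id, description, float(score), float(e_value))


if __name__ == "__main__":
    row = ("AT3G11650.1 | Symbols: NHL2 | NDR1/HIN1-like 2 | "
           "chr3:3676264-36... 53 2e-07")
    print(parse_hit_line(row))
```

The description is kept verbatim, including the trailing `...` where BLAST truncated it; the full annotation is only available in the per-hit header further down the report.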
>AT4G26490.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr4:13380425-13381231 FORWARD LENGTH=268
Length = 268
Score = 240 bits (612), Expect = 8e-64, Method: Compositional matrix adjust.
Identities = 115/233 (49%), Positives = 162/233 (69%), Gaps = 9/233 (3%)
Query: 18 TKPLSLDQIVISKQPT---NHLSLEPNSSNSAKTKLSRPPALRFQRTNPIIWFASVLCLI 74
T+ + Q+V++K T N L EP + L +P R RT+ IW + C +
Sbjct: 42 TQSTPVGQMVLTKPATVRFNGLDAEPRKD---RVILRQP---RSSRTSLWIWCVAGFCFV 95
Query: 75 FSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYFDSPQYFNGDFTLLANFTNPNTK 134
FSL+LIFF +ATL +FL I+PR P FDIPNANL+ +YFD+P++FNGD ++L NFTNPN K
Sbjct: 96 FSLLLIFFAIATLIVFLAIRPRIPVFDIPNANLHTIYFDTPEFFNGDLSMLVNFTNPNKK 155
Query: 135 IDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQSLHFISSLVFLPKDLGVMLEKQV 194
I+V FE L IELFF +R+I++Q ++PF Q++ E+RL+ + ISSLV LP + V L +Q+
Sbjct: 156 IEVKFEKLRIELFFFNRLIAAQVVQPFLQKKHETRLEPIRLISSLVGLPVNHAVELRRQL 215
Query: 195 QSNLVNYNVRGTFKVRVTLGLIHLSYLLHSRCQIEMTSPPTGGLVARKCITKR 247
++N + Y +RGTFKV+ G+IH SY LH RCQ++MT PPTG L++R C TK+
Sbjct: 216 ENNKIEYEIRGTFKVKAHFGMIHYSYQLHGRCQLQMTGPPTGILISRNCTTKK 268
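The percentages on the "Identities" line are the count divided by the alignment length, and in this report they appear to be truncated rather than rounded (162/233 is about 69.5%, printed as 69%). The Python sketch below re-derives them under that assumption; `STATS` and `check_percentages` are hypothetical helper names, not part of BLAST.

```python
import re

# Matches e.g. "Identities = 115/233 (49%)"; the Gaps field may be absent
# when an alignment has no gaps.
STATS = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def check_percentages(line: str) -> dict:
    """Return {field: (count, length, printed_pct, matches_truncation)}."""
    results = {}
    for name, count, length, printed in STATS.findall(line):
        count, length, printed = int(count), int(length), int(printed)
        # Integer division drops the fractional part (truncation).
        results[name] = (count, length, printed, count * 100 // length == printed)
    return results


if __name__ == "__main__":
    line = ("Identities = 115/233 (49%), Positives = 162/233 (69%), "
            "Gaps = 9/233 (3%)")
    for field, values in check_percentages(line).items():
        print(field, values)
```

A mismatch flagged by this check would more likely point to a damaged line in the report than to a BLAST error.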
>AT5G56050.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast; BEST Arabidopsis thaliana protein match is:
Late embryogenesis abundant (LEA) hydroxyproline-rich
glycoprotein family (TAIR:AT4G26490.1); Has 1807 Blast
hits to 1807 proteins in 277 species: Archae - 0;
Bacteria - 0; Metazoa - 736; Fungi - 347; Plants - 385;
Viruses - 0; Other Eukaryotes - 339 (source: NCBI
BLink). | chr5:22701167-22702018 REVERSE LENGTH=283
Length = 283
Score = 226 bits (577), Expect = 8e-60, Method: Compositional matrix adjust.
Identities = 110/238 (46%), Positives = 155/238 (65%), Gaps = 17/238 (7%)
Query: 21 LSLDQIVISKQPTNHLSLEPNSSNSAKTKLSRPPA-----------LRFQRTNPIIWFAS 69
++L ++++SK P + N + A KL A LR RTNP IW +
Sbjct: 51 IALTEVIVSKSPLS------NQKSPATPKLDSMEAHPLHETMVLLQLRTSRTNPWIWCGA 104
Query: 70 VLCLIFSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYFDSPQYFNGDFTLLANFT 129
LC IFS++LI FG+ATL ++L +KPR P FDI NA LN + F+SP YFNGD L NFT
Sbjct: 105 ALCFIFSILLIVFGIATLILYLAVKPRTPVFDISNAKLNTILFESPVYFNGDMLLQLNFT 164
Query: 130 NPNTKIDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQSLHFISSLVFLPKDLGVM 189
NPN K++V FE+L +EL+F+D I++Q + PF+QR ++RL+ + IS+LVFLP + +
Sbjct: 165 NPNKKLNVRFENLMVELWFADTKIATQGVLPFSQRNGKTRLEPIRLISNLVFLPVNHILE 224
Query: 190 LEKQVQSNLVNYNVRGTFKVRVTLGLIHLSYLLHSRCQIEMTSPPTGGLVARKCITKR 247
L +QV SN + Y +R F+V+ G+IH SY+LH CQ++++SPP GGLV R C TKR
Sbjct: 225 LRRQVTSNRIAYEIRSNFRVKAIFGMIHYSYMLHGICQLQLSSPPAGGLVYRNCTTKR 282
>AT3G26350.1 | Symbols: | LOCATED IN: chloroplast; EXPRESSED IN:
root, pedicel, carpel, stamen; EXPRESSED DURING: 4
anthesis, petal differentiation and expansion stage;
CONTAINS InterPro DOMAIN/s: Late embryogenesis abundant
protein, group 2 (InterPro:IPR004864); BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G13050.1); Has 3534 Blast hits to 2704 proteins
in 342 species: Archae - 6; Bacteria - 192; Metazoa -
1076; Fungi - 505; Plants - 1162; Viruses - 224; Other
Eukaryotes - 369 (source: NCBI BLink). |
chr3:9653660-9654730 REVERSE LENGTH=356
Length = 356
Score = 139 bits (351), Expect = 1e-33, Method: Compositional matrix adjust.
Identities = 78/196 (39%), Positives = 113/196 (57%), Gaps = 3/196 (1%)
Query: 53 PPALRFQRTNPIIWFASVLCLIFSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYF 112
PP R TN + W A+ C IF ++LI G+ L ++L +PR+P DI ANLNA Y
Sbjct: 163 PPPSR--ETNAMTWSAAFCCAIFWVILILGGLIILIVYLVYRPRSPYVDISAANLNAAYL 220
Query: 113 DSPQYFNGDFTLLANFTNPNTKIDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQS 172
D NGD T+LAN TNP+ K V F + EL++ + +I++Q IEPF ++ S +
Sbjct: 221 DMGFLLNGDLTILANVTNPSKKSSVEFSYVTFELYYYNTLIATQYIEPFKVPKKTSMFAN 280
Query: 173 LHFISSLVFLPKDLGVMLEKQVQSNLVNYNVRGTFKVRVTLG-LIHLSYLLHSRCQIEMT 231
+H +SS V L L++Q+++ V N+RG F R +G L SY LH+ C + +
Sbjct: 281 VHLVSSQVQLQATQSRELQRQIETGPVLLNLRGMFHARSHIGPLFRYSYKLHTHCSVSLN 340
Query: 232 SPPTGGLVARKCITKR 247
PP G + AR+C TKR
Sbjct: 341 GPPLGAMRARRCNTKR 356
>AT1G13050.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT3G26350.1); Has 538 Blast hits to 510 proteins
in 88 species: Archae - 0; Bacteria - 23; Metazoa - 81;
Fungi - 36; Plants - 361; Viruses - 8; Other Eukaryotes
- 29 (source: NCBI BLink). | chr1:4450568-4451521
FORWARD LENGTH=317
Length = 317
Score = 124 bits (310), Expect = 7e-29, Method: Compositional matrix adjust.
Identities = 85/235 (36%), Positives = 123/235 (52%), Gaps = 8/235 (3%)
Query: 16 NQTKPLSLDQIVISKQPTNHLSLEPNSSNSAKTKLSRPPALRF--QRTNPIIWFASVLCL 73
N +PL L + P EP A T+ PA + +RT P+ A++ C
Sbjct: 88 NSARPLQLSPEE-QRPPHRGYGSEPTPWRRAPTR----PAYQQGPKRTKPMTLPATICCA 142
Query: 74 IFSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYFDSPQYFNGDFTLLANFTNPNT 133
I +VLI G+ L ++L +PR+P FDI A LN D NGD ++ NFTNP+
Sbjct: 143 ILLIVLILSGLILLLVYLANRPRSPYFDISAATLNTANLDMGYVLNGDLAVVVNFTNPSK 202
Query: 134 KIDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQSLHFISSLVFLPKDLGVMLEKQ 193
K V F + EL+F + +I+++ IEPF + S S H +SS V + L+ Q
Sbjct: 203 KSSVDFSYVMFELYFYNTLIATEHIEPFIVPKGMSMFTSFHLVSSQVQIQMIQSQDLQLQ 262
Query: 194 VQSNLVNYNVRGTFKVRVTLG-LIHLSYLLHSRCQIEMTSPPTGGLVARKCITKR 247
+ + V N+RGTF R LG L+ SY LH++C I + +PP G + AR+C TKR
Sbjct: 263 LGTGPVLLNLRGTFHARSNLGSLMRYSYWLHTQCSISLNTPPAGTMRARRCNTKR 317
>AT1G13050.2 | Symbols: | unknown protein; FUNCTIONS IN:
molecular_function unknown; INVOLVED IN:
biological_process unknown; LOCATED IN: endomembrane
system; EXPRESSED IN: 14 plant structures; EXPRESSED
DURING: 9 growth stages; BEST Arabidopsis thaliana
protein match is: unknown protein (TAIR:AT3G26350.1);
Has 260 Blast hits to 259 proteins in 20 species: Archae
- 0; Bacteria - 0; Metazoa - 0; Fungi - 0; Plants - 260;
Viruses - 0; Other Eukaryotes - 0 (source: NCBI BLink).
| chr1:4450964-4451521 FORWARD LENGTH=185
Length = 185
Score = 119 bits (299), Expect = 1e-27, Method: Compositional matrix adjust.
Identities = 71/181 (39%), Positives = 103/181 (56%), Gaps = 1/181 (0%)
Query: 68 ASVLCLIFSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYFDSPQYFNGDFTLLAN 127
A++ C I +VLI G+ L ++L +PR+P FDI A LN D NGD ++ N
Sbjct: 5 ATICCAILLIVLILSGLILLLVYLANRPRSPYFDISAATLNTANLDMGYVLNGDLAVVVN 64
Query: 128 FTNPNTKIDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQSLHFISSLVFLPKDLG 187
FTNP+ K V F + EL+F + +I+++ IEPF + S S H +SS V +
Sbjct: 65 FTNPSKKSSVDFSYVMFELYFYNTLIATEHIEPFIVPKGMSMFTSFHLVSSQVQIQMIQS 124
Query: 188 VMLEKQVQSNLVNYNVRGTFKVRVTLG-LIHLSYLLHSRCQIEMTSPPTGGLVARKCITK 246
L+ Q+ + V N+RGTF R LG L+ SY LH++C I + +PP G + AR+C TK
Sbjct: 125 QDLQLQLGTGPVLLNLRGTFHARSNLGSLMRYSYWLHTQCSISLNTPPAGTMRARRCNTK 184
Query: 247 R 247
R
Sbjct: 185 R 185
>AT5G22870.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr5:7647056-7647679 REVERSE LENGTH=207
Length = 207
Score = 56.2 bits (134), Expect = 2e-08, Method: Compositional matrix adjust.
Identities = 41/155 (26%), Positives = 71/155 (45%), Gaps = 3/155 (1%)
Query: 69 SVLCLIF--SLVLIFFG-VATLTIFLGIKPRNPTFDIPNANLNALYFDSPQYFNGDFTLL 125
S++C IF L LIF V L +L KP+ + + NA++ + + + F
Sbjct: 24 SLICYIFLVILTLIFMAAVGFLITWLETKPKKLRYTVENASVQNFNLTNDNHMSATFQFT 83
Query: 126 ANFTNPNTKIDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQSLHFISSLVFLPKD 185
NPN +I V + S++I + F D+ ++ ++EPF Q R + I+ V + K
Sbjct: 84 IQSHNPNHRISVYYSSVEIFVKFKDQTLAFDTVEPFHQPRMNVKQIDETLIAENVAVSKS 143
Query: 186 LGVMLEKQVQSNLVNYNVRGTFKVRVTLGLIHLSY 220
G L Q + + V +VR +G+ S+
Sbjct: 144 NGKDLRSQNSLGKIGFEVFVKARVRFKVGIWKSSH 178
>AT5G56070.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT5G56050.1); Has 1807 Blast hits to 1807 proteins
in 277 species: Archae - 0; Bacteria - 0; Metazoa - 736;
Fungi - 347; Plants - 385; Viruses - 0; Other Eukaryotes
- 339 (source: NCBI BLink). | chr5:22708679-22709204
FORWARD LENGTH=119
Length = 119
Score = 56.2 bits (134), Expect = 2e-08, Method: Compositional matrix adjust.
Identities = 22/52 (42%), Positives = 36/52 (69%)
Query: 191 EKQVQSNLVNYNVRGTFKVRVTLGLIHLSYLLHSRCQIEMTSPPTGGLVARK 242
++QV SN++ Y + F+V+V +G I+ SY L CQ+++TSPP L++RK
Sbjct: 47 QRQVTSNMIEYEIISRFRVKVVIGYINYSYWLKGSCQLQLTSPPADDLLSRK 98
>AT3G11650.1 | Symbols: NHL2 | NDR1/HIN1-like 2 |
chr3:3676264-3676986 REVERSE LENGTH=240
Length = 240
Score = 52.8 bits (125), Expect = 2e-07, Method: Compositional matrix adjust.
Identities = 43/171 (25%), Positives = 75/171 (43%), Gaps = 6/171 (3%)
Query: 69 SVLCLIFSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYFDSPQYFNGDFTLLANF 128
S++C I V + GVA L ++L +P F + +ANLN FD N ++L NF
Sbjct: 53 SLICNILIAVAVILGVAALILWLIFRPNAVKFYVADANLNRFSFDPNN--NLHYSLDLNF 110
Query: 129 T--NPNTKIDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQSLHFIS-SLVFLPKD 185
T NPN ++ V ++ + ++ D+ S ++ F Q + + + +LV L
Sbjct: 111 TIRNPNQRVGVYYDEFSVSGYYGDQRFGSANVSSFYQGHKNTTVILTKIEGQNLVVLGDG 170
Query: 186 LGVMLEKQVQSNLVNYNVRGTFKVRVTLGLIHLSYLLHSRCQIEMTSPPTG 236
L+ +S + N + VR I S+ L + + + P G
Sbjct: 171 ARTDLKDDEKSGIYRINAKLRLSVRFKFWFIK-SWKLKPKIKCDDLKIPLG 220
>AT2G35460.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr2:14905788-14906504 FORWARD LENGTH=238
Length = 238
Score = 52.4 bits (124), Expect = 3e-07, Method: Compositional matrix adjust.
Identities = 36/168 (21%), Positives = 79/168 (47%), Gaps = 5/168 (2%)
Query: 69 SVLCLIFSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYFDSPQYFNGDFTLLANF 128
+++C I VL+ GV L ++ ++P F + A+L FD P+ N + + NF
Sbjct: 52 NIICNILIGVLVCLGVVALILWFILRPNVVKFQVTEADLTRFEFD-PRSHNLHYNISLNF 110
Query: 129 T--NPNTKIDVSFESLDIELFFSDRIISSQSIEPFTQRRRESRLQSLHFIS-SLVFLPKD 185
+ NPN ++ + ++ L++ ++ D+ S+ ++ F Q + + + LV L
Sbjct: 111 SIRNPNQRLGIHYDQLEVRGYYGDQRFSAANMTSFYQGHKNTTVVGTELNGQKLVLLGAG 170
Query: 186 LGVMLEKQVQSNLVNYNVRGTFKVRVTLGLIHLSYLLHSRCQIEMTSP 233
+ +S + +V+ FK+R G ++ S+ + + + + P
Sbjct: 171 GRRDFREDRRSGVYRIDVKLRFKLRFKFGFLN-SWAVRPKIKCHLKVP 217
>AT3G52470.1 | Symbols: | Late embryogenesis abundant (LEA)
hydroxyproline-rich glycoprotein family |
chr3:19450750-19451376 FORWARD LENGTH=208
Length = 208
Score = 49.7 bits (117), Expect = 2e-06, Method: Compositional matrix adjust.
Identities = 44/181 (24%), Positives = 77/181 (42%), Gaps = 10/181 (5%)
Query: 71 LCLIFSLVLIFFGVATLTIFLG---IKPRNPTFDIPNANLNALYFDSPQYFNGDFTLLAN 127
LC + ++ F + +TIFL ++P P F + +A + A P +F +
Sbjct: 19 LC---AAIIAFIVIVLITIFLVWVILRPTKPRFVLQDATVYAFNLSQPNLLTSNFQVTIA 75
Query: 128 FTNPNTKIDVSFESLDI-ELFFSDRIISSQSIEPFTQRRRESRLQSLHFISSLVFLPKDL 186
NPN+KI + ++ L + + + +I +I P Q +E + S + V +
Sbjct: 76 SRNPNSKIGIYYDRLHVYATYMNQQITLRTAIPPTYQGHKEVNVWSPFVYGTAVPIAPYN 135
Query: 187 GVMLEKQVQSNLVNYNVRGTFKVRVTL-GLIHLSYLLHSRCQ--IEMTSPPTGGLVARKC 243
V L ++ V +R VR + LI Y +H RCQ I + + G LV
Sbjct: 136 SVALGEEKDRGFVGLMIRADGTVRWKVRTLITGKYHIHVRCQAFINLGNKAAGVLVGDNA 195
Query: 244 I 244
+
Sbjct: 196 V 196
>AT3G11660.1 | Symbols: NHL1 | NDR1/HIN1-like 1 |
chr3:3679031-3679660 REVERSE LENGTH=209
Length = 209
Score = 47.4 bits (111), Expect = 8e-06, Method: Compositional matrix adjust.
Identities = 42/158 (26%), Positives = 72/158 (45%), Gaps = 6/158 (3%)
Query: 73 LIFSLVLIFFGVATLTIFLGIKPRNPTFDIPNANLNALYFDS--PQYFNGDFTLLANFTN 130
+IF L +IF + L I+ ++P P F + +A + A P +F + + N
Sbjct: 22 IIFVLFIIFLTI--LLIWAILQPSKPRFILQDATVYAFNVSGNPPNLLTSNFQITLSSRN 79
Query: 131 PNTKIDVSFESLDI-ELFFSDRIISSQSIEPFTQRRRESRLQSLHFISSLVFLPKDLGVM 189
PN KI + ++ LD+ + S +I SI P Q ++ + S + V + GV
Sbjct: 80 PNNKIGIYYDRLDVYATYRSQQITFPTSIPPTYQGHKDVDIWSPFVYGTSVPIAPFNGVS 139
Query: 190 LEKQVQSNLVNYNVRGTFKVRVTLG-LIHLSYLLHSRC 226
L+ + +V +R +VR +G I Y LH +C
Sbjct: 140 LDTDKDNGVVLLIIRADGRVRWKVGTFITGKYHLHVKC 177
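As a rough cross-check of the Expect values in this report, the Karlin-Altschul relation E = m * n * 2^(-S') links a bit score S' to an E-value through the search-space size. The Python sketch below uses the raw query length (247 letters) and the raw database size (14,482,855 letters) from the report header. BLAST itself uses shorter effective lengths, so this simplification overestimates E somewhat and should only be expected to land within roughly an order of magnitude of the printed values. `approx_e_value` is a hypothetical helper name.

```python
QUERY_LENGTH = 247             # "(247 letters)" in the report header
DATABASE_LETTERS = 14_482_855  # "14,482,855 total letters" for TAIR10_pep


def approx_e_value(bit_score: float,
                   m: int = QUERY_LENGTH,
                   n: int = DATABASE_LETTERS) -> float:
    """Estimate E = m * n * 2**(-bit_score) using raw, uncorrected lengths."""
    return m * n * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # (bit score, E-value printed in the report) for three of the hits above
    for score, reported in [(240, 8e-64), (139, 1e-33), (56.2, 2e-08)]:
        print(f"S'={score}: estimate {approx_e_value(score):.1e}, "
              f"reported {reported:.0e}")
```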