Miyakogusa Predicted Gene
- Lj1g3v3881740.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj1g3v3881740.1 Non Characterized Hit- tr|B9S6J5|B9S6J5_RICCO
Putative uncharacterized protein OS=Ricinus communis
G,65.57,0.0000000000009,coiled-coil,NULL; SUBFAMILY NOT NAMED,NULL;
FAMILY NOT NAMED,NULL; seg,NULL,gene.g35514.t1.1
(314 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
                                                                    Score   E
Sequences producing significant alignments:                        (bits)   Value
AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...    272   2e-73
AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like su...    167   7e-42
AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   140   9e-34
AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   140   1e-33
AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein ...   140   1e-33
AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein famil...    126   2e-29
AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis thal...     53   3e-07
AT1G61080.1 | Symbols: | Hydroxyproline-rich glycoprotein famil...     49   6e-06
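
The hit summary above can be read programmatically. The following is a minimal Python sketch, assuming each summary line ends with the bit score and E-value as its last two whitespace-separated fields and begins with the subject identifier. The helper name parse_hit_line is hypothetical and is not part of BLAST.

```python
def parse_hit_line(line):
    """Split one hit-summary line into (subject_id, bit_score, e_value).

    Assumption: the last two whitespace-separated fields are the bit
    score and the E-value; everything before them is the description.
    """
    description, score, evalue = line.rsplit(None, 2)
    # Some BLAST versions print very small E-values as "e-100";
    # prefix a "1" so that float() accepts the value.
    if evalue.startswith("e"):
        evalue = "1" + evalue
    subject_id = description.split()[0]
    return subject_id, float(score), float(evalue)


if __name__ == "__main__":
    line = ("AT4G18570.1 | Symbols: | Tetratricopeptide repeat "
            "(TPR)-like su... 167 7e-42")
    print(parse_hit_line(line))
```

Splitting from the right keeps the parse independent of how many spaces separate the columns, so it should work on both single-spaced and column-aligned versions of the table.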
>AT1G07120.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
chloroplast envelope; EXPRESSED IN: inflorescence
meristem, petal, leaf whorl, flower; EXPRESSED DURING: 4
anthesis, petal differentiation and expansion stage;
BEST Arabidopsis thaliana protein match is:
Tetratricopeptide repeat (TPR)-like superfamily protein
(TAIR:AT4G18570.1); Has 288 Blast hits to 260 proteins
in 50 species: Archae - 0; Bacteria - 8; Metazoa - 27;
Fungi - 15; Plants - 163; Viruses - 0; Other Eukaryotes
- 75 (source: NCBI BLink). | chr1:2184874-2186580
REVERSE LENGTH=392
Length = 392
Score = 272 bits (696), Expect = 2e-73, Method: Compositional matrix adjust.
Identities = 145/303 (47%), Positives = 199/303 (65%), Gaps = 25/303 (8%)
Query: 5 ENESEITYLKKKVELQMARNESLQRENQELREEVARLKSQIISFKAHDMERKSILWKKIQ 64
E++S++ L K+++ + RN+ L++EN ELR+EVARL++Q+ + K+H+ ERKS+LWKK+Q
Sbjct: 6 EDDSDLLRLVKELQAYLVRNDKLEKENHELRQEVARLRAQVSNLKSHENERKSMLWKKLQ 65
Query: 65 KSIDGNNADCEKSSENGNVHSNPGFQDSAXXXXXXXXXXXXXXXXXSSNLLPSHKNERGI 124
S DG+N D +V SN Q+ + N P+ I
Sbjct: 66 SSYDGSNTDGSNLKAPESVKSNTKGQE-----------------VRNPNPKPT------I 102
Query: 125 KMQQTIAXXXXXXX--XXXXFGLKAVRRVPEVIELYRSLTRKDVATENRIHQNGIPVVAF 182
+ Q T G ++VRR PEV+E YR+LT+++ N+I+QNG+ AF
Sbjct: 103 QGQSTATKPPPPPPLPSKRTLGKRSVRRAPEVVEFYRALTKRESHMGNKINQNGVLSPAF 162
Query: 183 TRNMIEEIENRSTYLTAIKSEVQRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSS 242
RNMI EIENRS YL+ IKS+ R + I LI +VEAA+F DISEVETF K++D ELSS
Sbjct: 163 NRNMIGEIENRSKYLSDIKSDTDRHRDHIHILISKVEAATFTDISEVETFVKWIDEELSS 222
Query: 243 LVDERSVLKHFPQWPEQKVDALREAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQAL 302
LVDER+VLKHFP+WPE+KVD+LREAACNY+ KNL E+ S++DNPK+ L QAL+RIQ+L
Sbjct: 223 LVDERAVLKHFPKWPERKVDSLREAACNYKRPKNLGNEILSFKDNPKDSLTQALQRIQSL 282
Query: 303 QDR 305
QDR
Sbjct: 283 QDR 285
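
The relationship between a bit score and its Expect value can be approximated with the standard formula E = m * n * 2^(-S), where S is the bit score, m the query length, and n the total number of letters in the database. The Python sketch below applies it to the values in this report (m = 314, n = 14,482,855). It is only an approximation: BLAST uses effective (edge-corrected) lengths, so the Expect values it reports are somewhat smaller than this estimate. The function name approx_evalue is hypothetical.

```python
def approx_evalue(bit_score, query_len, db_letters):
    """Rough E-value estimate from a bit score: E = m * n * 2**(-S).

    Assumption: raw query and database lengths are used here, whereas
    BLAST itself uses effective lengths, so this tends to overestimate.
    """
    return query_len * db_letters * 2.0 ** (-bit_score)


if __name__ == "__main__":
    QUERY_LEN = 314          # "(314 letters)" in the report header
    DB_LETTERS = 14_482_855  # "14,482,855 total letters"
    for bits in (272, 167, 140, 126):
        print(bits, f"{approx_evalue(bits, QUERY_LEN, DB_LETTERS):.1e}")
```

For the top hit (272 bits) the estimate lands within an order of magnitude of the reported Expect of 2e-73.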
>AT4G18570.1 | Symbols: | Tetratricopeptide repeat (TPR)-like
superfamily protein | chr4:10231439-10234534 FORWARD
LENGTH=642
Length = 642
Score = 167 bits (424), Expect = 7e-42, Method: Compositional matrix adjust.
Identities = 89/162 (54%), Positives = 112/162 (69%), Gaps = 8/162 (4%)
Query: 148 VRRVPEVIELYRSLTRKDVATENRIHQNG-------IPVVAFTRNMIEEIENRSTYLTAI 200
VRRVPEV+E Y SL R+D R G I + R+MI EIENRS YL AI
Sbjct: 353 VRRVPEVVEFYHSLMRRDSTNSRRDSTGGGNAAAEAILANSNARDMIGEIENRSVYLLAI 412
Query: 201 KSEVQRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSSLVDERSVLKHFPQWPEQK 260
K++V+ QG+FI FLIKEV A+F+DI +V F K+LD ELS LVDER+VLKHF +WPEQK
Sbjct: 413 KTDVETQGDFIRFLIKEVGNAAFSDIEDVVPFVKWLDDELSYLVDERAVLKHF-EWPEQK 471
Query: 261 VDALREAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQAL 302
DALREAA Y DLK L +E S + ++P++ + ALK++QAL
Sbjct: 472 ADALREAAFCYFDLKKLISEASRFREDPRQSSSSALKKMQAL 513
>AT3G25690.3 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD LENGTH=863
Length = 863
Score = 140 bits (354), Expect = 9e-34, Method: Compositional matrix adjust.
Identities = 76/158 (48%), Positives = 104/158 (65%), Gaps = 4/158 (2%)
Query: 148 VRRVPEVIELYRSLTRKDVATEN--RIHQNGIPVVAFTRN-MIEEIENRSTYLTAIKSEV 204
V R PE++E Y+SL +++ E + +G + RN MI EIENRST+L A+K++V
Sbjct: 577 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 636
Query: 205 QRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSSLVDERSVLKHFPQWPEQKVDAL 264
+ QG+F+ L EV A+SF DI ++ F +LD ELS LVDER+VLKHF WPE K DAL
Sbjct: 637 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHF-DWPEGKADAL 695
Query: 265 REAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQAL 302
REAA Y+DL LE +V+S+ D+P ALK++ L
Sbjct: 696 REAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKL 733
>AT3G25690.2 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 140 bits (353), Expect = 1e-33, Method: Compositional matrix adjust.
Identities = 76/158 (48%), Positives = 104/158 (65%), Gaps = 4/158 (2%)
Query: 148 VRRVPEVIELYRSLTRKDVATEN--RIHQNGIPVVAFTRN-MIEEIENRSTYLTAIKSEV 204
V R PE++E Y+SL +++ E + +G + RN MI EIENRST+L A+K++V
Sbjct: 718 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 777
Query: 205 QRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSSLVDERSVLKHFPQWPEQKVDAL 264
+ QG+F+ L EV A+SF DI ++ F +LD ELS LVDER+VLKHF WPE K DAL
Sbjct: 778 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHF-DWPEGKADAL 836
Query: 265 REAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQAL 302
REAA Y+DL LE +V+S+ D+P ALK++ L
Sbjct: 837 REAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKL 874
>AT3G25690.1 | Symbols: CHUP1 | Hydroxyproline-rich glycoprotein
family protein | chr3:9354061-9357757 FORWARD
LENGTH=1004
Length = 1004
Score = 140 bits (353), Expect = 1e-33, Method: Compositional matrix adjust.
Identities = 76/158 (48%), Positives = 104/158 (65%), Gaps = 4/158 (2%)
Query: 148 VRRVPEVIELYRSLTRKDVATEN--RIHQNGIPVVAFTRN-MIEEIENRSTYLTAIKSEV 204
V R PE++E Y+SL +++ E + +G + RN MI EIENRST+L A+K++V
Sbjct: 718 VHRAPELVEFYQSLMKRESKKEGAPSLISSGTGNSSAARNNMIGEIENRSTFLLAVKADV 777
Query: 205 QRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSSLVDERSVLKHFPQWPEQKVDAL 264
+ QG+F+ L EV A+SF DI ++ F +LD ELS LVDER+VLKHF WPE K DAL
Sbjct: 778 ETQGDFVQSLATEVRASSFTDIEDLLAFVSWLDEELSFLVDERAVLKHF-DWPEGKADAL 836
Query: 265 REAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQAL 302
REAA Y+DL LE +V+S+ D+P ALK++ L
Sbjct: 837 REAAFEYQDLMKLEKQVTSFVDDPNLSCEPALKKMYKL 874
>AT1G48280.1 | Symbols: | hydroxyproline-rich glycoprotein family
protein | chr1:17835196-17837553 FORWARD LENGTH=558
Length = 558
Score = 126 bits (316), Expect = 2e-29, Method: Compositional matrix adjust.
Identities = 69/164 (42%), Positives = 108/164 (65%), Gaps = 5/164 (3%)
Query: 146 KAVR--RVPEVIELYRSLTRKDVA--TENRIHQNGIPVVAFTRNMIEEIENRSTYLTAIK 201
KA R + P V +L++ L ++D + ++ N V + +++ EI+NRS +L AIK
Sbjct: 275 KAARAQKSPPVSQLFQLLNKQDNSRNLSQSVNGNKSQVNSAHNSIVGEIQNRSAHLIAIK 334
Query: 202 SEVQRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSSLVDERSVLKHFPQWPEQKV 261
++++ +GEFI+ LI++V F+D+ +V F +LD EL++L DER+VLKHF +WPE+K
Sbjct: 335 ADIETKGEFINDLIQKVLTTCFSDMEDVMKFVDWLDKELATLADERAVLKHF-KWPEKKA 393
Query: 262 DALREAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQALQDR 305
D L+EAA YR+LK LE E+SSY D+P ALK++ L D+
Sbjct: 394 DTLQEAAVEYRELKKLEKELSSYSDDPNIHYGVALKKMANLLDK 437
>AT4G04980.1 | Symbols: | unknown protein; BEST Arabidopsis
thaliana protein match is: unknown protein
(TAIR:AT1G11070.1); Has 23100 Blast hits to 15699
proteins in 1063 species: Archae - 116; Bacteria - 2262;
Metazoa - 8308; Fungi - 3268; Plants - 3181; Viruses -
958; Other Eukaryotes - 5007 (source: NCBI BLink). |
chr4:2544210-2547893 REVERSE LENGTH=880
Length = 880
Score = 52.8 bits (125), Expect = 3e-07, Method: Compositional matrix adjust.
Identities = 43/168 (25%), Positives = 77/168 (45%), Gaps = 18/168 (10%)
Query: 148 VRRVPEVIELYRSL------------TRKDVATENRIHQNGIPVVAFTRNM---IEEIEN 192
+RR ++ LY +L T+K +N + + PV M + E+
Sbjct: 590 LRRSAQIANLYWALKGKLEGRGVEGKTKKASKGQNSVAEKS-PVKVARSGMADALAEMTK 648
Query: 193 RSTYLTAIKSEVQRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSSLVDERSVLKH 252
RS+Y I+ +VQ+ + I L + + D+ E+ F ++ L L DE VL
Sbjct: 649 RSSYFQQIEEDVQKYAKSIEELKSSIHSFQTKDMKELLEFHSKVESILEKLTDETQVLAR 708
Query: 253 FPQWPEQKVDALREAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQ 300
F +PE+K++ +R A Y+ L + E+ +++ P PL L +I+
Sbjct: 709 FEGFPEKKLEVIRTAGALYKKLDGILVELKNWKIEP--PLNDLLDKIE 754
>AT1G61080.1 | Symbols: | Hydroxyproline-rich glycoprotein family
protein | chr1:22493194-22497019 REVERSE LENGTH=907
Length = 907
Score = 48.5 bits (114), Expect = 6e-06, Method: Compositional matrix adjust.
Identities = 34/126 (26%), Positives = 65/126 (51%), Gaps = 7/126 (5%)
Query: 187 IEEIENRSTYLTAIKSEVQRQGEFISFLIKEVEAASFADISEVETFTKFLDGELSSLVDE 246
+ EI +S Y I++++ + I+ L E+ D++E+ +F + ++ L +L DE
Sbjct: 707 LAEITKKSAYFLQIQADIAKYMTSINELKIEITKFQTKDMTELLSFHRRVESVLENLTDE 766
Query: 247 RSVLKHFPQWPEQKVDALREAACNYRDLKNLEAEVSSYEDNPKEPLAQALKRIQALQDRR 306
VL +P++K++A+R A Y L + E+ + + P PL Q L +++ R
Sbjct: 767 SQVLARCEGFPQKKLEAMRMAVALYTKLHGMITELQNMKIEP--PLNQLLDKVE-----R 819
Query: 307 ACTKIR 312
TKI+
Sbjct: 820 YFTKIK 825
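
The per-alignment statistics lines ("Identities = a/b (p%)", and likewise for Positives and Gaps) can be checked mechanically. The Python sketch below parses such a line and recomputes each percentage. It assumes the percentage is truncated rather than rounded, which is what the values in this report suggest (for example, 145/303 is 47.85% and is printed as 47%). The names parse_stats and percent_matches are hypothetical.

```python
import re

# Matches e.g. "Identities = 145/303 (47%)"
STAT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+) \((\d+)%\)")


def parse_stats(line):
    """Return {name: (count, alignment_length, reported_percent)}."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STAT_RE.findall(line)
    }


def percent_matches(count, length, reported_percent):
    """Check the reported percentage, assuming integer truncation."""
    return 100 * count // length == reported_percent


if __name__ == "__main__":
    line = ("Identities = 89/162 (54%), Positives = 112/162 (69%), "
            "Gaps = 8/162 (4%)")
    for name, values in parse_stats(line).items():
        print(name, values, percent_matches(*values))
```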