Miyakogusa Predicted Gene
- Lj1g3v3892320.1
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj1g3v3892320.1 tr|A9SWX5|A9SWX5_PHYPA Predicted protein
OS=Physcomitrella patens subsp. patens
GN=PHYPADRAFT_234309,40,0.000000000000007,FAMILY NOT NAMED,NULL;
seg,NULL; BAR/IMD domain-like,NULL,CUFF.31391.1
(497 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                   Score    E
Sequences producing significant alignments:                        (bits) Value
AT2G33490.1 | Symbols: | hydroxyproline-rich glycoprotein famil...   273   2e-73
AT5G41100.1 | Symbols: | FUNCTIONS IN: molecular_function unkno...   237   1e-62
AT5G41100.2 | Symbols: | FUNCTIONS IN: molecular_function unkno...   237   2e-62
AT3G26910.2 | Symbols: | hydroxyproline-rich glycoprotein famil...   231   1e-60
AT3G26910.1 | Symbols: | hydroxyproline-rich glycoprotein famil...   231   1e-60
>AT2G33490.1 | Symbols: | hydroxyproline-rich glycoprotein family
protein | chr2:14183552-14187666 FORWARD LENGTH=623
Length = 623
Score = 273 bits (699), Expect = 2e-73, Method: Compositional matrix adjust.
Identities = 202/497 (40%), Positives = 277/497 (55%), Gaps = 50/497 (10%)
Query: 1 MKRQCDEKRDVYEYMIAQQXXXXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSLK 60
M+R CDEKR+VYE M+ +Q + QQLQ AHD+Y+ E TL FRLKSLK
Sbjct: 132 MQRLCDEKRNVYEGMLTRQREKGRSKGGKGETFSPQQLQEAHDDYENETTLFVFRLKSLK 191
Query: 61 QGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFSGLEDDDGEDC 120
QGQ+RSLLTQAARH+AAQL FF+K L SLE V+PHV+MV QHIDY FSGLEDDDG+D
Sbjct: 192 QGQTRSLLTQAARHHAAQLCFFKKALSSLEEVDPHVQMVTESQHIDYHFSGLEDDDGDDE 251
Query: 121 SEDAGND-DEIVEGTELSFNYRSNKQGPYTASTSPNSAEVEESRLSYVRASTAETAEIDK 179
E+ ND E+ + ELSF YR N + S++ S+E+ S +++ + TA+ +
Sbjct: 252 IENNENDGSEVHDDGELSFEYRVNDKDQDADSSAGGSSELGNSDITFPQIGGPYTAQ-EN 310
Query: 180 NQGDFKFS---TRDRRVSSYSAPIFAEKKFD-PAEKVRLLLSSSAAKPNAYVLPTPVNIK 235
+G+++ S RD R S SAP+F E + P+EK+ + S+ K N Y LPTPV
Sbjct: 311 EEGNYRKSHSFRRDVRAVSQSAPLFPENRTTPPSEKLLRMRSTLTRKFNTYALPTPVETT 370
Query: 236 ETKTXXXXXXXXXXXXHN--------LWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILK 287
+ + N +W+SSPL+ + K S + +L+
Sbjct: 371 RSPSSTTSPGHKNVGSSNPTKAITKQIWYSSPLETRGPAK-----VSSRSMVALKEQVLR 425
Query: 288 ESNSDNTSIQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPV------ 341
ESN NTS +LP P A+GL + R++FSGPLT+KPL KP+
Sbjct: 426 ESNK-NTS-RLPPPLADGLLFSRLGTLK--------RRSFSGPLTSKPLPNKPLSTTSHL 475
Query: 342 -SGAFPRLPMPQPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKSSRVGHSAP 400
SG PR P+ + + +ISELHELPRPP S+K S +G+SAP
Sbjct: 476 YSGPIPRNPVSKLPKVSSSPTASPTFVSTPKISELHELPRPPPRSSTK--SSRELGYSAP 533
Query: 401 LVFRNPDHPAANKFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTKYLDTRQ 460
LV R + ++++ ASPLP PP ++RSFSIP+SN RA L+++ T L T++
Sbjct: 534 LVSR-----SQLLSKPLITNSASPLPIPP-AITRSFSIPTSNLRASDLDMSKTS-LGTKK 586
Query: 461 IPEKVEVAASPPLTPIS 477
+ SPPLTP+S
Sbjct: 587 L-----GTPSPPLTPMS 598
>AT5G41100.1 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
plasma membrane; EXPRESSED IN: 23 plant structures;
EXPRESSED DURING: 13 growth stages; BEST Arabidopsis
thaliana protein match is: hydroxyproline-rich
glycoprotein family protein (TAIR:AT3G26910.2); Has 1503
Blast hits to 1197 proteins in 220 species: Archae - 4;
Bacteria - 108; Metazoa - 481; Fungi - 318; Plants -
186; Viruses - 39; Other Eukaryotes - 367 (source: NCBI
BLink). | chr5:16447429-16450610 FORWARD LENGTH=586
Length = 586
Score = 237 bits (605), Expect = 1e-62, Method: Compositional matrix adjust.
Identities = 200/492 (40%), Positives = 267/492 (54%), Gaps = 86/492 (17%)
Query: 1 MKRQCDEKRDVYEYMIAQQXX-XXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSL 59
MK+QC+EKRDV ++M+ + + +QL+ A DE ++EATLC FRLKSL
Sbjct: 133 MKQQCEEKRDVVKHMLMEHVKDKVQVKGTKGERLIRRQLETARDELQDEATLCIFRLKSL 192
Query: 60 KQGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFSGLEDDDGED 119
K+GQ+RSLLTQAARH+ AQ++ F GLKSLEAVE HVR+ A QHID S + + D
Sbjct: 193 KEGQARSLLTQAARHHTAQMHMFFAGLKSLEAVEQHVRIAADRQHIDCVLS--DPGNEMD 250
Query: 120 CSEDAGNDDEIV-EGTELSFNYRSNKQGPYTASTSPNSAEVEESRLSYVRASTAETAEID 178
CSED +DD +V ELSF+Y +++Q ST S +++++ LS+ R S A +A ++
Sbjct: 251 CSEDNDDDDRLVNRDGELSFDYITSEQRVEVISTPHGSMKMDDTDLSFQRPSPAGSATVN 310
Query: 179 KN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKPNAYVLPTPVNIKET 237
+ + + S RDRR SS+SAP+F +KK D A++ ++ SA NAY+LPTPV+ K +
Sbjct: 311 ADPREEHSVSNRDRRTSSHSAPLFPDKKADLADRSMRQMTPSA---NAYILPTPVDSKSS 367
Query: 238 KTXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSDNTSIQ 297
NLWHSSPL EP I AH K++ S N +
Sbjct: 368 PIFTKPVTQTNHSA-NLWHSSPL---------------EP-IKTAH---KDAES-NLYSR 406
Query: 298 LPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPVSGAFPRLPMP-----Q 352
LPRPS AFSGPL KP S RLP+P Q
Sbjct: 407 LPRPS---------------------EHAFSGPL--KPSST--------RLPVPVAVQAQ 435
Query: 353 PTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKS---SRVGHSAPLVFRNPDHP 409
+SP+ RI+ELHELPRPPG Q + P +S VGHSAPL N +
Sbjct: 436 SSSPRISPTASPPLASSPRINELHELPRPPG-QFAPPRRSKSPGLVGHSAPLTAWNQERS 494
Query: 410 AANKFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTKYLDTRQIPEKVE--V 467
++V ASPLP PP++V RS+SIPS NQRAMA + +PE+ + V
Sbjct: 495 NVVVSTNIV---ASPLPVPPLVVPRSYSIPSRNQRAMA----------QQPLPERNQNRV 541
Query: 468 AASP--PLTPIS 477
A+ P PLTP S
Sbjct: 542 ASPPPLPLTPAS 553
>AT5G41100.2 | Symbols: | FUNCTIONS IN: molecular_function unknown;
INVOLVED IN: biological_process unknown; LOCATED IN:
plasma membrane; EXPRESSED IN: 23 plant structures;
EXPRESSED DURING: 13 growth stages; BEST Arabidopsis
thaliana protein match is: hydroxyproline-rich
glycoprotein family protein (TAIR:AT3G26910.2); Has 1497
Blast hits to 1191 proteins in 214 species: Archae - 4;
Bacteria - 102; Metazoa - 485; Fungi - 316; Plants -
187; Viruses - 37; Other Eukaryotes - 366 (source: NCBI
BLink). | chr5:16447429-16450686 FORWARD LENGTH=582
Length = 582
Score = 237 bits (604), Expect = 2e-62, Method: Compositional matrix adjust.
Identities = 202/494 (40%), Positives = 267/494 (54%), Gaps = 90/494 (18%)
Query: 1 MKRQCDEKRDVYEYMIAQQXX-XXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSL 59
MK+QC+EKRDV ++M+ + + +QL+ A DE ++EATLC FRLKSL
Sbjct: 133 MKQQCEEKRDVVKHMLMEHVKDKVQVKGTKGERLIRRQLETARDELQDEATLCIFRLKSL 192
Query: 60 KQGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFSGLEDDDGE- 118
K+GQ+RSLLTQAARH+ AQ++ F GLKSLEAVE HVR+ A QHID S D G
Sbjct: 193 KEGQARSLLTQAARHHTAQMHMFFAGLKSLEAVEQHVRIAADRQHIDCVLS----DPGNE 248
Query: 119 -DCSEDAGNDDEIV-EGTELSFNYRSNKQGPYTASTSPNSAEVEESRLSYVRASTAETAE 176
DCSED +DD +V ELSF+Y +++Q ST S +++++ LS+ R S A +A
Sbjct: 249 MDCSEDNDDDDRLVNRDGELSFDYITSEQRVEVISTPHGSMKMDDTDLSFQRPSPAGSAT 308
Query: 177 IDKN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKPNAYVLPTPVNIK 235
++ + + + S RDRR SS+SAP+F +KK D A++ ++ SA NAY+LPTPV+ K
Sbjct: 309 VNADPREEHSVSNRDRRTSSHSAPLFPDKKADLADRSMRQMTPSA---NAYILPTPVDSK 365
Query: 236 ETKTXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSDNTS 295
+ NLWHSSPL EP I AH K++ S N
Sbjct: 366 SSPIFTKPVTQTNHSA-NLWHSSPL---------------EP-IKTAH---KDAES-NLY 404
Query: 296 IQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPVSGAFPRLPMP---- 351
+LPRPS AFSGPL KP S RLP+P
Sbjct: 405 SRLPRPS---------------------EHAFSGPL--KPSST--------RLPVPVAVQ 433
Query: 352 -QPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKS---SRVGHSAPLVFRNPD 407
Q +SP+ RI+ELHELPRPPG Q + P +S VGHSAPL N +
Sbjct: 434 AQSSSPRISPTASPPLASSPRINELHELPRPPG-QFAPPRRSKSPGLVGHSAPLTAWNQE 492
Query: 408 HPAANKFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTKYLDTRQIPEKVE- 466
++V ASPLP PP++V RS+SIPS NQRAMA + +PE+ +
Sbjct: 493 RSNVVVSTNIV---ASPLPVPPLVVPRSYSIPSRNQRAMA----------QQPLPERNQN 539
Query: 467 -VAASP--PLTPIS 477
VA+ P PLTP S
Sbjct: 540 RVASPPPLPLTPAS 553
>AT3G26910.2 | Symbols: | hydroxyproline-rich glycoprotein family
protein | chr3:9915304-9918511 REVERSE LENGTH=614
Length = 614
Score = 231 bits (589), Expect = 1e-60, Method: Compositional matrix adjust.
Identities = 192/503 (38%), Positives = 266/503 (52%), Gaps = 74/503 (14%)
Query: 1 MKRQCDEKRDVYEYMIAQQXXXXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSLK 60
MK+QCD KR+VYE + ++ + + A+ E+ +EAT+C FRLKSLK
Sbjct: 134 MKQQCDGKRNVYEMSLVKEKGRPKSSKGERH--IPPESRPAYSEFHDEATMCIFRLKSLK 191
Query: 61 QGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFS--GLEDDDGE 118
+GQ+RSLL QA RH+ AQ+ F GLKSLEAVE HV++ QHID S G E + E
Sbjct: 192 EGQARSLLIQAVRHHTAQMRLFHTGLKSLEAVERHVKVAVEKQHIDCDLSVHGNEMEASE 251
Query: 119 DCSEDAGNDDEIVEGTELSFNYRSNKQGPYTASTS-PNSAEVEESRLSYVRASTAETAEI 177
D +D + EG ELSF+YR+N+Q +S S P + +++++ LS+ R ST A +
Sbjct: 252 DDDDDGRYMNR--EG-ELSFDYRTNEQKVEASSLSTPWATKMDDTDLSFPRPSTTRPAAV 308
Query: 178 DKN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKP--NAYVLPTPVNI 234
+ + + ++ STRD+ +SS+SAP+F EKK D +E++R A P NAYVLPTP +
Sbjct: 309 NADHREEYPVSTRDKYLSSHSAPLFPEKKPDVSERLR------QANPSFNAYVLPTPNDS 362
Query: 235 KETK--TXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSD 292
+ +K + N+WHSSPL+ K+ KD D ESNS
Sbjct: 363 RYSKPVSQALNPRPTNHSAGNIWHSSPLEPIKSGKDGKDA---------------ESNSF 407
Query: 293 NTSIQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPV------SGAFP 346
+LPRPS Q + R AFSGPL +P S KP+ SGAF
Sbjct: 408 YG--RLPRPSTTDTHHHQ--------QQAAGRHAFSGPL--RPSSTKPITMADSYSGAFC 455
Query: 347 RLPMP--------QPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKS---SRV 395
LP P +SP+ R++ELHELPRPPG + P ++ V
Sbjct: 456 PLPTPPVLQSHPHSSSSPRVSPTASPPPASSPRLNELHELPRPPGHFAPPPRRAKSPGLV 515
Query: 396 GHSAPLVFRNPDHPAAN-KFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTK 454
GHSAPL N + PS + ASPLP PP++V RS+SIPS NQR ++
Sbjct: 516 GHSAPLTAWNQERSTVTVAVPSATNIVASPLPVPPLVVPRSYSIPSRNQRVVS------- 568
Query: 455 YLDTRQIPEKVEVAASPPLTPIS 477
R + + ++ ASPPLTP+S
Sbjct: 569 ---QRLVERRDDIVASPPLTPMS 588
>AT3G26910.1 | Symbols: | hydroxyproline-rich glycoprotein family
protein | chr3:9915338-9918511 REVERSE LENGTH=608
Length = 608
Score = 231 bits (588), Expect = 1e-60, Method: Compositional matrix adjust.
Identities = 192/503 (38%), Positives = 266/503 (52%), Gaps = 74/503 (14%)
Query: 1 MKRQCDEKRDVYEYMIAQQXXXXXXXXXXXXXITLQQLQAAHDEYKEEATLCAFRLKSLK 60
MK+QCD KR+VYE + ++ + + A+ E+ +EAT+C FRLKSLK
Sbjct: 134 MKQQCDGKRNVYEMSLVKEKGRPKSSKGERH--IPPESRPAYSEFHDEATMCIFRLKSLK 191
Query: 61 QGQSRSLLTQAARHYAAQLNFFRKGLKSLEAVEPHVRMVAGHQHIDYQFS--GLEDDDGE 118
+GQ+RSLL QA RH+ AQ+ F GLKSLEAVE HV++ QHID S G E + E
Sbjct: 192 EGQARSLLIQAVRHHTAQMRLFHTGLKSLEAVERHVKVAVEKQHIDCDLSVHGNEMEASE 251
Query: 119 DCSEDAGNDDEIVEGTELSFNYRSNKQGPYTASTS-PNSAEVEESRLSYVRASTAETAEI 177
D +D + EG ELSF+YR+N+Q +S S P + +++++ LS+ R ST A +
Sbjct: 252 DDDDDGRYMNR--EG-ELSFDYRTNEQKVEASSLSTPWATKMDDTDLSFPRPSTTRPAAV 308
Query: 178 DKN-QGDFKFSTRDRRVSSYSAPIFAEKKFDPAEKVRLLLSSSAAKP--NAYVLPTPVNI 234
+ + + ++ STRD+ +SS+SAP+F EKK D +E++R A P NAYVLPTP +
Sbjct: 309 NADHREEYPVSTRDKYLSSHSAPLFPEKKPDVSERLR------QANPSFNAYVLPTPNDS 362
Query: 235 KETK--TXXXXXXXXXXXXHNLWHSSPLDEKKNEKDLVDGKLSEPAIPRAHSILKESNSD 292
+ +K + N+WHSSPL+ K+ KD D ESNS
Sbjct: 363 RYSKPVSQALNPRPTNHSAGNIWHSSPLEPIKSGKDGKDA---------------ESNSF 407
Query: 293 NTSIQLPRPSAEGLSLPQFDIFNASDSKKTLRQAFSGPLTNKPLSVKPV------SGAFP 346
+LPRPS Q + R AFSGPL +P S KP+ SGAF
Sbjct: 408 YG--RLPRPSTTDTHHHQ--------QQAAGRHAFSGPL--RPSSTKPITMADSYSGAFC 455
Query: 347 RLPMP--------QPTSPKAXXXXXXXXXXXXRISELHELPRPPGDQSSKPMKSSR---V 395
LP P +SP+ R++ELHELPRPPG + P ++ V
Sbjct: 456 PLPTPPVLQSHPHSSSSPRVSPTASPPPASSPRLNELHELPRPPGHFAPPPRRAKSPGLV 515
Query: 396 GHSAPLVFRNPDHPAAN-KFPSVVSSGASPLPTPPIMVSRSFSIPSSNQRAMALNVTNTK 454
GHSAPL N + PS + ASPLP PP++V RS+SIPS NQR ++
Sbjct: 516 GHSAPLTAWNQERSTVTVAVPSATNIVASPLPVPPLVVPRSYSIPSRNQRVVS------- 568
Query: 455 YLDTRQIPEKVEVAASPPLTPIS 477
R + + ++ ASPPLTP+S
Sbjct: 569 ---QRLVERRDDIVASPPLTPMS 588