Miyakogusa Predicted Gene
- Lj2g3v0486970.2
BLASTP 2.2.25 [Feb-01-2011]
Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer,
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997),
"Gapped BLAST and PSI-BLAST: a new generation of protein database search
programs", Nucleic Acids Res. 25:3389-3402.
Reference for compositional score matrix adjustment: Altschul, Stephen F.,
John C. Wootton, E. Michael Gertz, Richa Agarwala, Aleksandr Morgulis,
Alejandro A. Schaffer, and Yi-Kuo Yu (2005) "Protein database searches
using compositionally adjusted substitution matrices", FEBS J. 272:5101-5109.
Query= Lj2g3v0486970.2 Non Characterized Hit- tr|I1MQ84|I1MQ84_SOYBN
Uncharacterized protein OS=Glycine max GN=Gma.26345
PE,91.09,0,seg,NULL; CPSF_A,Cleavage/polyadenylation specificity
factor, A subunit, C-terminal; MMS1_N,NULL; CL,CUFF.34661.2
(1100 letters)
Database: TAIR10_pep
35,386 sequences; 14,482,855 total letters
Searching..................................................done
                                                                     Score    E
Sequences producing significant alignments:                          (bits)   Value
AT5G51660.1 | Symbols: CPSF160, ATCPSF160 | cleavage and polyade...    1625   0.0
AT4G05420.2 | Symbols: DDB1A | damaged DNA binding protein 1A | ...      89   2e-17
AT4G05420.1 | Symbols: DDB1A | damaged DNA binding protein 1A | ...      89   2e-17
AT4G21100.1 | Symbols: DDB1B | damaged DNA binding protein 1B | ...      86   2e-16
AT3G55200.1 | Symbols: | Cleavage and polyadenylation specifici...       73   1e-12
AT3G55220.1 | Symbols: | Cleavage and polyadenylation specifici...       73   1e-12
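The hit summary above can be read programmatically by splitting each line from the right, since the last two whitespace-separated fields are the bit score and the E-value. The following is a minimal sketch, not part of the BLAST output; the function name `parse_hit_line` and the 1e-15 cutoff are illustrative assumptions.

```python
def parse_hit_line(line):
    """Split one hit-summary line into (description, bit score, E-value)."""
    # The last two whitespace-separated fields are the score and the E-value;
    # everything before them is the (possibly truncated) description.
    description, score, evalue = line.rsplit(None, 2)
    return description.strip(), float(score), float(evalue)


# Three of the six summary lines from this report.
hits = [
    "AT5G51660.1 | Symbols: CPSF160, ATCPSF160 | cleavage and polyade... 1625 0.0",
    "AT4G05420.2 | Symbols: DDB1A | damaged DNA binding protein 1A | ... 89 2e-17",
    "AT3G55200.1 | Symbols: | Cleavage and polyadenylation specifici... 73 1e-12",
]

parsed = [parse_hit_line(h) for h in hits]

# Keep only the subject identifiers whose E-value is below a hypothetical cutoff.
significant = [desc.split(" | ")[0] for desc, score, evalue in parsed if evalue < 1e-15]
print(significant)
```

Splitting from the right keeps the parse independent of how many `|` separators or spaces the description contains.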
>AT5G51660.1 | Symbols: CPSF160, ATCPSF160 | cleavage and
polyadenylation specificity factor 160 |
chr5:20980250-20989268 FORWARD LENGTH=1442
Length = 1442
Score = 1625 bits (4208), Expect = 0.0, Method: Compositional matrix adjust.
Identities = 783/1101 (71%), Positives = 912/1101 (82%), Gaps = 17/1101 (1%)
Query: 1 MPRSFFNVELDAANATWLSNDVAXXXXXXXXXXXXXXIFDGRVVQRLDLSKSKASVLSSG 60
+P S F+VELDAA+ TW+SNDVA I+DGR VQRLDLSKSKASVL+S
Sbjct: 358 LPASNFSVELDAAHGTWISNDVALLSTKSGELLLLTLIYDGRAVQRLDLSKSKASVLASD 417
Query: 61 ITTIGNSLFFLASRLGDSMLVQFSCGSSVSMLSSNLKEEVGDIEGDASSTKRLRRSPSDS 120
IT++GNSLFFL SRLGDS+LVQFSC S + L++E DIEG+ KRLR + SD+
Sbjct: 418 ITSVGNSLFFLGSRLGDSLLVQFSCRSGPAASLPGLRDEDEDIEGEGHQAKRLRMT-SDT 476
Query: 121 LHDMVSGEELSLYGSAPNRTESAQKSFSFAVRDSLINIGPLKDFSYGLRINADANATGIA 180
D + EELSL+GS PN ++SAQKSFSFAVRDSL+N+GP+KDF+YGLRINADANATG++
Sbjct: 477 FQDTIGNEELSLFGSTPNNSDSAQKSFSFAVRDSLVNVGPVKDFAYGLRINADANATGVS 536
Query: 181 KQSNYELVCCSGHGKNGSLCVLRQSIRPEVITEVELPGCKGIWTVYHKSARSHISDSSKL 240
KQSNYELVCCSGHGKNG+LCVLRQSIRPE+ITEVELPGCKGIWTVYHKS+R H +DSSK+
Sbjct: 537 KQSNYELVCCSGHGKNGALCVLRQSIRPEMITEVELPGCKGIWTVYHKSSRGHNADSSKM 596
Query: 241 ADDDDEYHAYLIISLEARTMVLETADLLSEVTESVDYYVQGKTLAAGNLFGRRRVIQVYE 300
A D+DEYHAYLIISLEARTMVLETADLL+EVTESVDYYVQG+T+AAGNLFGRRRVIQV+E
Sbjct: 597 AADEDEYHAYLIISLEARTMVLETADLLTEVTESVDYYVQGRTIAAGNLFGRRRVIQVFE 656
Query: 301 RGARILDGSFMTQDIXXXXXXXXXXXXXXXALALSVSIADPYVLLRMSDGSIRLLVGDPS 360
GARILDGSFM Q++ + SVSIADPYVLLRM+D SIRLLVGDPS
Sbjct: 657 HGARILDGSFMNQELSFGASNSESNSGSESSTVSSVSIADPYVLLRMTDDSIRLLVGDPS 716
Query: 361 TCTISVTXXXXXXXXXXXXXXCTLYHDKGPEPWLRKTSTDAWLSTGVGEAIDGTDGAPQD 420
TCT+S++ CTLYHDKGPEPWLRK STDAWLS+GVGEA+D DG PQD
Sbjct: 717 TCTVSISSPSVLEGSKRKISACTLYHDKGPEPWLRKASTDAWLSSGVGEAVDSVDGGPQD 776
Query: 421 HGDIYCVVCYENGNLEIFDVPNFSCVFSVENFMSGKSHLVDALTKEVAKDSQKGDKGSDA 480
GDIYCVVCYE+G LEIFDVP+F+CVFSV+ F SG+ HL D E+ + K +
Sbjct: 777 QGDIYCVVCYESGALEIFDVPSFNCVFSVDKFASGRRHLSDMPIHELEYELNKNSED--- 833
Query: 481 VANQGRKENVLNMKVVELAMQRWSGQHSRPFLFGILSDGTILCYHAYLYESPDGTSKVED 540
N KE + N +VVELAMQRWSG H+RPFLF +L+DGTILCYHAYL++ D T K E+
Sbjct: 834 --NTSSKE-IKNTRVVELAMQRWSGHHTRPFLFAVLADGTILCYHAYLFDGVDST-KAEN 889
Query: 541 SVSASGPVDLSSTSVSRLRNLRFVRLPLDAYPREETSNGSPGQHITIFKNIGSYEGFFLS 600
S+S+ P L+S+ S+LRNL+F+R+PLD RE TS+G Q IT+FKNI ++GFFLS
Sbjct: 890 SLSSENPAALNSSGSSKLRNLKFLRIPLDTSTREGTSDGVASQRITMFKNISGHQGFFLS 949
Query: 601 GSRPAWVMVLRERLRVHPQLCDGSILAFTVLHNVNCNHGLIYVTSQGVLKICQLPTGSNY 660
GSRP W M+ RERLR H QLCDGSI AFTVLHNVNCNHG IYVT+QGVLKICQLP+ S Y
Sbjct: 950 GSRPGWCMLFRERLRFHSQLCDGSIAAFTVLHNVNCNHGFIYVTAQGVLKICQLPSASIY 1009
Query: 661 DSHWPVQKVPLKATPHQVTYFAEKNLYPLIVSFPVLKPLSQVVS-LVDPDANHQTENPNL 719
D++WPVQK+PLKATPHQVTY+AEKNLYPLIVS+PV KPL+QV+S LVD +A Q +N N+
Sbjct: 1010 DNYWPVQKIPLKATPHQVTYYAEKNLYPLIVSYPVSKPLNQVLSSLVDQEAGQQLDNHNM 1069
Query: 720 NSDEQNRFYTVDEFEVRIMEPEKSGGPWQTKATIPMQSSENALTVKMVTLVNTTSKENET 779
+SD+ R YTV+EFE++I+EPE+SGGPW+TKA IPMQ+SE+ALTV++VTL+N ++ ENET
Sbjct: 1070 SSDDLQRTYTVEEFEIQILEPERSGGPWETKAKIPMQTSEHALTVRVVTLLNASTGENET 1129
Query: 780 LLAVGTAYVQGEDVAARGRILLFSLGKNTDNPQNLVSEVYSKESKGDVSALASLQGHLLI 839
LLAVGTAYVQGEDVAARGR+LLFS GKN DN QN+V+EVYS+E KG +SA+AS+QGHLLI
Sbjct: 1130 LLAVGTAYVQGEDVAARGRVLLFSFGKNGDNSQNVVTEVYSRELKGAISAVASIQGHLLI 1189
Query: 840 ASGPKITLHKWTGTELTGIAFFDAPPLHVVSLNIVKNFILIGDVHKSIYFLSWKEQGAQL 899
+SGPKI LHKW GTEL G+AFFDAPPL+VVS+N+VK+FIL+GDVHKSIYFLSWKEQG+QL
Sbjct: 1190 SSGPKIILHKWNGTELNGVAFFDAPPLYVVSMNVVKSFILLGDVHKSIYFLSWKEQGSQL 1249
Query: 900 NLLAKDFGSLNCFATEFLIDGSTLSLMVSDDQKNIQIFYYAPKMSESWKGQKLLSRAEFH 959
+LLAKDF SL+CFATEFLIDGSTLSL VSD+QKNIQ+FYYAPKM ESWKG KLLSRAEFH
Sbjct: 1250 SLLAKDFESLDCFATEFLIDGSTLSLAVSDEQKNIQVFYYAPKMIESWKGLKLLSRAEFH 1309
Query: 960 VGAHVTKFLRLQMLSTSDRTGAGPGSDKTNRFALLFGTLDGSIGCIAPLDEITFRRLQSL 1019
VGAHV+KFLRLQM+S+ G+DK NRFALLFGTLDGS GCIAPLDE+TFRRLQSL
Sbjct: 1310 VGAHVSKFLRLQMVSS--------GADKINRFALLFGTLDGSFGCIAPLDEVTFRRLQSL 1361
Query: 1020 QRKLVDAVPHVAGLNPRAFRQFNSNGKAHRPGPDSIVDCELLCHYEMLPLEEQLEIAHLI 1079
Q+KLVDAVPHVAGLNP AFRQF S+GKA R GPDSIVDCELLCHYEMLPLEEQLE+AH I
Sbjct: 1362 QKKLVDAVPHVAGLNPLAFRQFRSSGKARRSGPDSIVDCELLCHYEMLPLEEQLELAHQI 1421
Query: 1080 GTTRSQILTNLSDLSLGTSFL 1100
GTTR IL +L DLS+GTSFL
Sbjct: 1422 GTTRYSILKDLVDLSVGTSFL 1442
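The percentages on the summary line of the AT5G51660.1 alignment above can be reproduced from the raw counts. In this report they appear to be truncated rather than rounded (912/1101 is about 82.8% but is shown as 82%), so the sketch below uses integer division. The regular expression and the function name `alignment_stats` are illustrative assumptions, not part of BLAST.

```python
import re

# Matches "Identities = 783/1101", "Positives = 912/1101", "Gaps = 17/1101".
STAT_RE = re.compile(r"(Identities|Positives|Gaps) = (\d+)/(\d+)")


def alignment_stats(line):
    """Return {name: (count, alignment_length, truncated_percent)}."""
    stats = {}
    for name, count, length in STAT_RE.findall(line):
        count, length = int(count), int(length)
        # Integer division truncates, matching the values printed in this report.
        stats[name] = (count, length, count * 100 // length)
    return stats


line = "Identities = 783/1101 (71%), Positives = 912/1101 (82%), Gaps = 17/1101 (1%)"
stats = alignment_stats(line)
print(stats)
```

The same calculation matches the weaker hits in this report, for example 34/310 gaps reported as 10% for AT4G05420.2.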
>AT4G05420.2 | Symbols: DDB1A | damaged DNA binding protein 1A |
chr4:2746288-2752663 FORWARD LENGTH=1067
Length = 1067
Score = 88.6 bits (218), Expect = 2e-17, Method: Compositional matrix adjust.
Identities = 87/310 (28%), Positives = 144/310 (46%), Gaps = 34/310 (10%)
Query: 747 WQTKATIPMQSSENALTVKMVTLVNTTSKENETLLAVGTAYV-QGEDVAARGRILLFSLG 805
++ +T P+ S E ++ L + +++ VGTAYV E+ +GRIL+F +
Sbjct: 733 FEFMSTYPLDSFEYGCSI----LSCSFTEDKNVYYCVGTAYVLPEENEPTKGRILVFIV- 787
Query: 806 KNTDNPQNLVSEVYSKESKGDVSALASLQGHLLIASGPKITLHKWT----GT-ELTGIAF 860
D L++E KE+KG V +L + G LL A KI L+KW GT EL
Sbjct: 788 --EDGRLQLIAE---KETKGAVYSLNAFNGKLLAAINQKIQLYKWMLRDDGTRELQSECG 842
Query: 861 FDAPPLHVVSLNIVK--NFILIGDVHKSIYFLSWKEQGAQLNLLAKDFGSLNCFATEFLI 918
H+++L + +FI++GD+ KSI L +K + + A+D+ + A E L
Sbjct: 843 HHG---HILALYVQTRGDFIVVGDLMKSISLLLYKHEEGAIEERARDYNANWMSAVEILD 899
Query: 919 DGSTLSLMVSDDQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLR--LQMLSTS 976
D L +++ N+ + + + +L E+H+G V +F L M
Sbjct: 900 DDIYLG---AENNFNLLTVKKNSEGATDEERGRLEVVGEYHLGEFVNRFRHGSLVMRLPD 956
Query: 977 DRTGAGPGSDKTNRFALLFGTLDGSIGCIAPLDEITFRRLQSLQRKLVDAVPHVAGLNPR 1036
G P ++FGT++G IG IA L + + L+ LQ L + V GL+
Sbjct: 957 SEIGQIP--------TVIFGTVNGVIGVIASLPQEQYTFLEKLQSSLRKVIKGVGGLSHE 1008
Query: 1037 AFRQFNSNGK 1046
+R FN+ +
Sbjct: 1009 QWRSFNNEKR 1018
>AT4G05420.1 | Symbols: DDB1A | damaged DNA binding protein 1A |
chr4:2746288-2752663 FORWARD LENGTH=1088
Length = 1088
Score = 88.6 bits (218), Expect = 2e-17, Method: Compositional matrix adjust.
Identities = 87/310 (28%), Positives = 144/310 (46%), Gaps = 34/310 (10%)
Query: 747 WQTKATIPMQSSENALTVKMVTLVNTTSKENETLLAVGTAYV-QGEDVAARGRILLFSLG 805
++ +T P+ S E ++ L + +++ VGTAYV E+ +GRIL+F +
Sbjct: 754 FEFMSTYPLDSFEYGCSI----LSCSFTEDKNVYYCVGTAYVLPEENEPTKGRILVFIV- 808
Query: 806 KNTDNPQNLVSEVYSKESKGDVSALASLQGHLLIASGPKITLHKWT----GT-ELTGIAF 860
D L++E KE+KG V +L + G LL A KI L+KW GT EL
Sbjct: 809 --EDGRLQLIAE---KETKGAVYSLNAFNGKLLAAINQKIQLYKWMLRDDGTRELQSECG 863
Query: 861 FDAPPLHVVSLNIVK--NFILIGDVHKSIYFLSWKEQGAQLNLLAKDFGSLNCFATEFLI 918
H+++L + +FI++GD+ KSI L +K + + A+D+ + A E L
Sbjct: 864 HHG---HILALYVQTRGDFIVVGDLMKSISLLLYKHEEGAIEERARDYNANWMSAVEILD 920
Query: 919 DGSTLSLMVSDDQKNIQIFYYAPKMSESWKGQKLLSRAEFHVGAHVTKFLR--LQMLSTS 976
D L +++ N+ + + + +L E+H+G V +F L M
Sbjct: 921 DDIYLG---AENNFNLLTVKKNSEGATDEERGRLEVVGEYHLGEFVNRFRHGSLVMRLPD 977
Query: 977 DRTGAGPGSDKTNRFALLFGTLDGSIGCIAPLDEITFRRLQSLQRKLVDAVPHVAGLNPR 1036
G P ++FGT++G IG IA L + + L+ LQ L + V GL+
Sbjct: 978 SEIGQIP--------TVIFGTVNGVIGVIASLPQEQYTFLEKLQSSLRKVIKGVGGLSHE 1029
Query: 1037 AFRQFNSNGK 1046
+R FN+ +
Sbjct: 1030 QWRSFNNEKR 1039
>AT4G21100.1 | Symbols: DDB1B | damaged DNA binding protein 1B |
chr4:11258916-11265309 REVERSE LENGTH=1088
Length = 1088
Score = 85.5 bits (210), Expect = 2e-16, Method: Compositional matrix adjust.
Identities = 80/273 (29%), Positives = 128/273 (46%), Gaps = 26/273 (9%)
Query: 782 AVGTAYV-QGEDVAARGRILLFSLGKNTDNPQNLVSEVYSKESKGDVSALASLQGHLLIA 840
VGTAYV E+ +GRIL+F + + L++E KE+KG V +L + G LL +
Sbjct: 785 CVGTAYVLPEENEPTKGRILVFIV---EEGRLQLITE---KETKGAVYSLNAFNGKLLAS 838
Query: 841 SGPKITLHKWT----GT-ELTGIAFFDAPPLHVVSLNIVK--NFILIGDVHKSIYFLSWK 893
KI L+KW GT EL H+++L + +FI +GD+ KSI L +K
Sbjct: 839 INQKIQLYKWMLRDDGTRELQSECGHHG---HILALYVQTRGDFIAVGDLMKSISLLIYK 895
Query: 894 EQGAQLNLLAKDFGSLNCFATEFLIDGSTLSLMVSDDQKNIQIFYYAPKMSESWKGQKLL 953
+ + A+D+ + A E L D L +D+ NI + + + ++
Sbjct: 896 HEEGAIEERARDYNANWMTAVEILNDDIYLG---TDNCFNIFTVKKNNEGATDEERARME 952
Query: 954 SRAEFHVGAHVTKFLRLQMLSTSDRTGAGPGSDKTNRFALLFGTLDGSIGCIAPLDEITF 1013
E+H+G V +F ++ P SD ++FGT+ G IG IA L + +
Sbjct: 953 VVGEYHIGEFVNRFRHGSLVMKL------PDSDIGQIPTVIFGTVSGMIGVIASLPQEQY 1006
Query: 1014 RRLQSLQRKLVDAVPHVAGLNPRAFRQFNSNGK 1046
L+ LQ L + V GL+ +R FN+ +
Sbjct: 1007 AFLEKLQTSLRKVIKGVGGLSHEQWRSFNNEKR 1039
>AT3G55200.1 | Symbols: | Cleavage and polyadenylation specificity
factor (CPSF) A subunit protein | chr3:20460533-20464361
FORWARD LENGTH=1214
Length = 1214
Score = 72.8 bits (177), Expect = 1e-12, Method: Compositional matrix adjust.
Identities = 86/380 (22%), Positives = 164/380 (43%), Gaps = 56/380 (14%)
Query: 735 VRIMEPEKSGGPWQTKATIPMQSSENALTVKMVTLVNTTSKENETLLAVGTAYVQGEDVA 794
+R+++P+ + T + +Q +E A +V VN KE TLLAVGT V+G
Sbjct: 863 IRVLDPKTA----TTTCLLELQDNEAAYSV---CTVNFHDKEYGTLLAVGT--VKGMQFW 913
Query: 795 ARGRIL--LFSLGKNTDNPQNLVSEVYSKESKGDVSALASLQGHLLIASGPKITLHKWTG 852
+ ++ + + ++ ++L ++ + +G AL QG LL GP + L+
Sbjct: 914 PKKNLVAGFIHIYRFVEDGKSL-ELLHKTQVEGVPLALCQFQGRLLAGIGPVLRLYDLGK 972
Query: 853 TELTGIAFFDAPPLHVVSLNIVKNFILIGDVHKSIYFLSWKEQGAQLNLLAKD------- 905
L P ++S+ ++ I +GD+ +S ++ ++ QL + A D
Sbjct: 973 KRLLRKCENKLFPNTIISIQTYRDRIYVGDIQESFHYCKYRRDENQLYIFADDCVPRWLT 1032
Query: 906 ---------FGSLNCFATEFLID-GSTLSLMVSDDQKNIQIFYYAPKMSESWKGQKLLSR 955
+ F + + LS + +D +I + K++ + K+
Sbjct: 1033 ASHHVDFDTMAGADKFGNVYFVRLPQDLSEEIEEDPTGGKIKWEQGKLNGA--PNKVDEI 1090
Query: 956 AEFHVGAHVTKFLRLQMLSTSDRTGAGPGSDKTNRFALLFGTLDGSIGCIAPL---DEIT 1012
+FHVG VT + M+ PG ++ +++GT+ GSIG + D++
Sbjct: 1091 VQFHVGDVVTCLQKASMI---------PGGSES----IMYGTVMGSIGALHAFTSRDDVD 1137
Query: 1013 FRRLQSLQRKLVDAVPHVAGLNPRAFRQFNSNGKAHRPGPDSIVDCELLCHYEMLPLEEQ 1072
F L+ + P + G + A+R A+ P D ++D +L + LP++ Q
Sbjct: 1138 F--FSHLEMHMRQEYPPLCGRDHMAYR------SAYFPVKD-VIDGDLCEQFPTLPMDLQ 1188
Query: 1073 LEIAHLIGTTRSQILTNLSD 1092
+IA + T ++IL L D
Sbjct: 1189 RKIADELDRTPAEILKKLED 1208
>AT3G55220.1 | Symbols: | Cleavage and polyadenylation specificity
factor (CPSF) A subunit protein | chr3:20467116-20470944
REVERSE LENGTH=1214
Length = 1214
Score = 72.8 bits (177), Expect = 1e-12, Method: Compositional matrix adjust.
Identities = 86/380 (22%), Positives = 164/380 (43%), Gaps = 56/380 (14%)
Query: 735 VRIMEPEKSGGPWQTKATIPMQSSENALTVKMVTLVNTTSKENETLLAVGTAYVQGEDVA 794
+R+++P+ + T + +Q +E A +V VN KE TLLAVGT V+G
Sbjct: 863 IRVLDPKTA----TTTCLLELQDNEAAYSV---CTVNFHDKEYGTLLAVGT--VKGMQFW 913
Query: 795 ARGRIL--LFSLGKNTDNPQNLVSEVYSKESKGDVSALASLQGHLLIASGPKITLHKWTG 852
+ ++ + + ++ ++L ++ + +G AL QG LL GP + L+
Sbjct: 914 PKKNLVAGFIHIYRFVEDGKSL-ELLHKTQVEGVPLALCQFQGRLLAGIGPVLRLYDLGK 972
Query: 853 TELTGIAFFDAPPLHVVSLNIVKNFILIGDVHKSIYFLSWKEQGAQLNLLAKD------- 905
L P ++S+ ++ I +GD+ +S ++ ++ QL + A D
Sbjct: 973 KRLLRKCENKLFPNTIISIQTYRDRIYVGDIQESFHYCKYRRDENQLYIFADDCVPRWLT 1032
Query: 906 ---------FGSLNCFATEFLID-GSTLSLMVSDDQKNIQIFYYAPKMSESWKGQKLLSR 955
+ F + + LS + +D +I + K++ + K+
Sbjct: 1033 ASHHVDFDTMAGADKFGNVYFVRLPQDLSEEIEEDPTGGKIKWEQGKLNGA--PNKVDEI 1090
Query: 956 AEFHVGAHVTKFLRLQMLSTSDRTGAGPGSDKTNRFALLFGTLDGSIGCIAPL---DEIT 1012
+FHVG VT + M+ PG ++ +++GT+ GSIG + D++
Sbjct: 1091 VQFHVGDVVTCLQKASMI---------PGGSES----IMYGTVMGSIGALHAFTSRDDVD 1137
Query: 1013 FRRLQSLQRKLVDAVPHVAGLNPRAFRQFNSNGKAHRPGPDSIVDCELLCHYEMLPLEEQ 1072
F L+ + P + G + A+R A+ P D ++D +L + LP++ Q
Sbjct: 1138 F--FSHLEMHMRQEYPPLCGRDHMAYR------SAYFPVKD-VIDGDLCEQFPTLPMDLQ 1188
Query: 1073 LEIAHLIGTTRSQILTNLSD 1092
+IA + T ++IL L D
Sbjct: 1189 RKIADELDRTPAEILKKLED 1208