Commit 2ef204f by smtnkc · 1 parent: 32fd68d

Plotting and custom reference data

Files changed (8)
  1. INSTRUCTIONS.md +38 -37
  2. app.py +101 -13
  3. output.csv +21 -0
  4. predict.py +23 -26
  5. reference_delta.csv +0 -0
  6. requirements.txt +1 -0
  7. response.json +0 -132
  8. target.csv +13 -3
INSTRUCTIONS.md CHANGED
@@ -1,52 +1,51 @@
1
  ### Running Instructions
2
 
3
- 1. Upload a CSV file with the target sequences. CSV file must have ``accession_id`` and ``sequence`` columns. See [example target file](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/target.csv) which includes 5 New Omicron (``EPI_ISL_177...``) and 5 Eris (``EPI_ISL_189...``) sequences.
4
-
5
- 2. Application will compare the given sequences with the average Omicron embedding. This embedding has been generated using [2000 Omicron sequences](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/omicron.csv).
6
-
7
- 3. You will get a JSON response in the following format:
8
-
9
- ```json
10
- {
11
- "EPI_ISL_18905639": {
12
- "sc": 410.788391,
13
- "sp": 0.000113,
14
- "ip": 9e-05,
15
- "log10(sc)": 2.613618,
16
- "log10(sp)": -3.946409,
17
- "log10(ip)": -4.047602,
18
- "rank_by_sc": 2,
19
- "rank_by_sp": 3,
20
- "rank_by_ip": 2,
21
- "rank_by_scsp": 5,
22
- "rank_by_scip": 4
23
- },
24
- ...
25
- }
26
- ```
27
 
28
- where:
29
 
30
- * ``sc``: semantic change
31
- * ``sp``: sequence probability
32
- * ``ip``: inverse perplexity
33
- * ``rank_by_sc``: rank by semantic change (In descending order)
34
- * ``rank_by_sp``: rank by sequence probability (In descending order)
35
- * ``rank_by_ip``: rank by inverse perplexity (In descending order)
36
- * ``rank_by_scsp``: rank by semantic change + rank by sequence probability (Both in descending order)
37
- * ``rank_by_scip``: rank by semantic change + rank by inverse perplexity (Both in descending order)
38
 
 
39
 
40
- ### The Ranking Mechanism
41
 
42
- We measure grammaticality using either sequence probability (`sp`) or inverse perplexity (`ip`). Sequences that have a higher semantic change (`sc`) and higher grammaticality tend to have a higher escape potential. We rank sequences in descending order, meaning sequences with smaller rank values have a higher escape potential. In the JSON response, the sequences are sorted by ascending `rank_by_scsp`. Therefore, the top element has the smallest `rank_by_scsp`, representing the sequence with the highest escape potential.
43
 
44
- See [the response](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/response.json) for [the example target file](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/target.csv).
45
 
 
46
 
47
  ### Model Details
48
 
49
- This application uses the model checkpoint with the highest zero-shot test accuracy (91.5%).
50
 
51
  #### Training Parameters:
52
 
@@ -65,6 +64,7 @@ This application uses the model checkpoint with the highest zero-shot test accur
65
  | Margin | 2.0 |
66
  | Epochs | [0, 9] |
67
 
 
68
 
69
  #### Training Results:
70
 
@@ -92,6 +92,7 @@ tokenizers==0.13.3
92
  scanpy==1.9.3
93
  scikit-learn==1.2.2
94
  scipy==1.10.1
 
95
  torch-optimizer==0.3.0
96
  torchmetrics==0.9.0
97
  torch==1.12.1+cu113
 
1
  ### Running Instructions
2
 
3
+ ##### 1. Set reference dataset
4
 
5
+ The application compares the target sequences with the average embedding of the reference sequences. The reference set can be `Omicron` or `Other`:
6
 
7
+ * If you select `Omicron`, the application uses the average embedding of [2000 Omicron sequences](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/omicron.csv). This embedding is pre-generated, so you do not need to upload a reference dataset.
8
 
9
+ * If you select `Other`, upload a CSV file with the reference sequences. The CSV file must have a ``sequence`` column. The model then generates the average embedding of the given sequences and uses it as the reference.
10
 
11
+ ##### 2. Set target dataset
12
+
13
+ Upload a CSV file with the target sequences. The CSV file must have ``accession_id`` and ``sequence`` columns.
14
+
15
+ See [the example target file](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/target.csv), which includes 10 Omicron (``EPI_ISL_177...``) and 10 Eris (``EPI_ISL_189...``) sequences. Note that the selected Omicron sequences are currently circulating and are not included in the training dataset.
16
+
17
+ ##### 3. Get output
18
 
19
+ The output will be a dataframe with the following columns:
20
+
21
+ | Column | Description |
22
+ |------------------|-------------------------------------------------------------|
23
+ | `accession_id` | Accession ID |
24
+ | `log10(sc)` | Log-scaled semantic change |
25
+ | `log10(sp)` | Log-scaled sequence probability |
26
+ | `log10(ip)` | Log-scaled inverse perplexity |
27
+ | `log10(gr)` | Log-scaled grammaticality where `gr = (sp + ip) / 2` |
28
+ | `rank_by_sc` | Rank by semantic change |
29
+ | `rank_by_sp` | Rank by sequence probability |
30
+ | `rank_by_ip` | Rank by inverse perplexity |
31
+ | `rank_by_gr` | Rank by grammaticality |
32
+ | `rank_by_scsp` | Rank by semantic change + Rank by sequence probability |
33
+ | `rank_by_scip` | Rank by semantic change + Rank by inverse perplexity |
34
+ | `rank_by_scgr` | Rank by semantic change + Rank by grammaticality |
35
+
36
+ **Note:** All ranks are assigned in descending order of the underlying score, so the highest value receives rank 1. By default, the output is sorted by ascending `rank_by_scgr`.
37
+
38
+ See [the output](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/output.csv) for [the example target file](https://huggingface.co/spaces/smtnkc/cov-snn-app/resolve/main/target.csv).
39
+
40
+ ### The Ranking Mechanism
41
 
42
+ In the original implementation of [Constrained Semantic Change Search](https://www.science.org/doi/10.1126/science.abd7331) (CSCS), grammaticality (`gr`) is determined by sequence probability (`sp`). We propose a more robust metric for grammaticality by averaging sequence probability (`sp`) and inverse perplexity (`ip`).
43
 
44
+ Sequences with both high semantic change (`sc`) and high grammaticality (`gr`) are expected to have a greater escape potential. Each score is ranked in descending order, so smaller rank values correspond to higher escape potential. The output is sorted by ascending `rank_by_scgr`, so the top row has the smallest `rank_by_scgr` and therefore the highest predicted escape potential.
45
 
46
  ### Model Details
47
 
48
+ This application uses the pre-trained model with the highest zero-shot test accuracy (91.5%).
49
 
50
  #### Training Parameters:
51
 
 
64
  | Margin | 2.0 |
65
  | Epochs | [0, 9] |
66
 
67
+ To train models for specific use cases, please refer to the instructions in [our GitHub repository](https://github.com/smtnkc/CoV-SNN).
68
 
69
  #### Training Results:
70
 
 
92
  scanpy==1.9.3
93
  scikit-learn==1.2.2
94
  scipy==1.10.1
95
+ plotly==5.24.1
96
  torch-optimizer==0.3.0
97
  torchmetrics==0.9.0
98
  torch==1.12.1+cu113
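
The ranking rule described in the updated INSTRUCTIONS.md can be illustrated with a small, dependency-free sketch. The accession IDs and scores below are made-up toy values, and `descending_ranks` and `rank_by_scgr` are hypothetical helper names. The app itself computes ranks with pandas in `predict.py`, so its handling of tied scores may differ from the simple rule assumed here.

```python
def descending_ranks(values):
    """Give the largest value rank 1; tied values share the smallest rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def rank_by_scgr(records):
    """Add gr and the rank columns to each record, then sort by rank_by_scgr."""
    for rec in records:
        # Grammaticality is the mean of sequence probability and inverse perplexity.
        rec["gr"] = (rec["sp"] + rec["ip"]) / 2
    sc_ranks = descending_ranks([rec["sc"] for rec in records])
    gr_ranks = descending_ranks([rec["gr"] for rec in records])
    for rec, r_sc, r_gr in zip(records, sc_ranks, gr_ranks):
        rec["rank_by_sc"] = r_sc
        rec["rank_by_gr"] = r_gr
        rec["rank_by_scgr"] = r_sc + r_gr
    # Smallest combined rank first, i.e. highest predicted escape potential.
    return sorted(records, key=lambda rec: rec["rank_by_scgr"])


# Toy input (illustrative values only, not taken from the model).
ranked = rank_by_scgr([
    {"accession_id": "C", "sc": 4.0, "sp": 8e-5, "ip": 8e-5},
    {"accession_id": "A", "sc": 412.0, "sp": 1.2e-4, "ip": 9e-5},
    {"accession_id": "B", "sc": 410.0, "sp": 1.0e-4, "ip": 8e-5},
])
for rec in ranked:
    print(rec["accession_id"], rec["rank_by_scgr"])
```

Summing ranks rather than raw scores keeps `sc` and `gr` on a comparable footing even though their magnitudes differ by several orders.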
app.py CHANGED
@@ -1,8 +1,9 @@
1
  import streamlit as st
 
2
  import pandas as pd
3
- import json
4
  import numpy as np
5
- from predict import process_target_data # Import your function
6
 
7
 
8
  st.set_page_config(page_title="CoV-SNN", page_icon="🧬")
@@ -18,23 +19,110 @@ def main():
18
  except FileNotFoundError:
19
  readme_text = "INSTRUCTIONS.md file not found."
20
 
21
- st.markdown("Upload target sequences. Make sure the CSV file has ``accession_id`` and ``sequence`` columns.")
22
 
23
  # File uploader for the target.csv
24
- uploaded_file = st.file_uploader("Upload target sequences.", type=["csv"], label_visibility="collapsed")
25
 
26
- if uploaded_file is not None:
27
- # Read the uploaded CSV file into a DataFrame
28
- target_dataset = pd.read_csv(uploaded_file)
29
 
30
- # Assuming you have a pre-defined average_embedding
31
- average_embedding = np.load("average_omicron_embedding.npy")
32
 
33
- # Process the target dataset
34
- results_dict = process_target_data(average_embedding, target_dataset)
35
 
36
- # Display results as JSON
37
- st.json(results_dict)
 
 
 
38
 
39
  # Display the README.md file
40
  st.markdown(readme_text)
 
1
  import streamlit as st
2
+ import os
3
  import pandas as pd
4
+ import plotly.express as px
5
  import numpy as np
6
+ from predict import process_target_data, get_average_embedding # Import your function
7
 
8
 
9
  st.set_page_config(page_title="CoV-SNN", page_icon="🧬")
 
19
  except FileNotFoundError:
20
  readme_text = "INSTRUCTIONS.md file not found."
21
 
22
+ option = st.radio(
23
+ "Select a reference embedding:",
24
+ ["Omicron", "Other"],
25
+ captions=["Use average embedding of Omicron sequences (Pre-generated)", "Generate average embedding of your own sequences (Takes longer)"],)
26
+
27
+ # File uploader for the reference.csv
28
+ reference_file = st.file_uploader("Upload reference sequences. Make sure the CSV file has ``sequence`` column.",
29
+ type=["csv"],
30
+ disabled=option == "Omicron")
31
 
32
  # File uploader for the target.csv
33
+ target_file = st.file_uploader("Upload target sequences. Make sure the CSV file has ``accession_id`` and ``sequence`` columns.",
34
+ type=["csv"],
35
+ disabled = option == "Other" and reference_file is None)
36
+
37
+ if target_file is not None and (option == "Omicron" or reference_file is not None):
38
+
39
+ if option == "Omicron":
40
+ # Load the pre-generated average Omicron embedding
41
+ average_embedding = np.load("average_omicron_embedding.npy")
42
+ print(f"Average Omicron embedding loaded from file with shape {average_embedding.shape}")
43
+ else:
44
+ with st.spinner('Calculating average embedding...'):
45
+ ref_df = pd.read_csv(reference_file)
46
+ average_embedding = get_average_embedding(ref_df)
47
+
48
+ with st.spinner('Predicting escape potentials...'):
49
+ # Read the uploaded CSV file into a DataFrame
50
+ target_dataset = pd.read_csv(target_file)
51
+
52
+ # Process the target dataset
53
+ results_df = process_target_data(average_embedding, target_dataset)
54
+
55
+ # Reverse rank_by_scgr by subtracting it from the maximum rank value plus one
56
+ results_df['Escape Potential'] = results_df['rank_by_scgr'].max() + 1 - results_df['rank_by_scgr']
57
+
58
+ # Create a scatter plot colored by escape potential on a continuous scale
59
+ fig = px.scatter(
60
+ results_df.applymap(lambda x: round(x, 6) if isinstance(x, (int, float)) else x),
61
+ x="log10(gr)",
62
+ y="log10(sc)",
63
+ labels={"log10(gr)": "log10(gr)", "log10(sc)": "log10(sc)"},
64
+ title="CoV-SNN Results",
65
+ hover_name="accession_id",
66
+ color="Escape Potential",
67
+ color_continuous_scale=["green", "yellow", "red"],
68
+ hover_data={
69
+ "log10(sp)": True, # display log10(sp)
70
+ "log10(sc)": True, # display log10(sc)
71
+ "log10(ip)": True, # display log10(ip)
72
+ "log10(gr)": True, # display log10(gr)
73
+ "sp": False, # hide raw sp
74
+ "sc": False, # hide raw sc
75
+ "ip": False, # hide raw ip
76
+ "gr": False, # hide raw gr
77
+ "rank_by_sc": True, # display rank by sc
78
+ "rank_by_sp": True, # display rank by sp
79
+ "rank_by_ip": True, # display rank by ip
80
+ "rank_by_scsp": True, # display rank by scsp
81
+ "rank_by_scip": True, # display rank by scip
82
+ "rank_by_scgr": True, # display rank by scgr
83
+ "Escape Potential": False
84
+ },
85
+ )
86
+
87
+ # Hide the colorbar ticks and labels
88
+ fig.update_coloraxes(
89
+ colorbar=dict(
90
+ title=None,
91
+ tickvals=[],
92
+ ticktext=[],
93
+ y=0.5,
94
+ len=0.7
95
+ )
96
+ )
97
 
98
+ # Hide the legend
99
+ #fig.update_layout(showlegend=False)
 
100
 
101
+ # Add a rotated colorbar title via annotations
102
+ fig.update_layout(
103
+ margin=dict(r=110),
104
+ annotations=[
105
+ dict(
106
+ text="Escape Potential",
107
+ font_size=14,
108
+ textangle=270,
109
+ showarrow=False,
110
+ xref="paper",
111
+ yref="paper",
112
+ x=1.14,
113
+ y=0.5
114
+ )
115
+ ]
116
+ )
117
 
118
+ # Display the plot in Streamlit
119
+ st.plotly_chart(fig, theme="streamlit", border=True, use_container_width=True, border_color="black")
120
 
121
+ # Display the results as a DataFrame
122
+ st.dataframe(results_df[["accession_id", "log10(sc)", "log10(sp)", "log10(ip)",
123
+ "log10(gr)", "rank_by_sc", "rank_by_sp",
124
+ "rank_by_ip", "rank_by_gr", "rank_by_scsp", "rank_by_scip",
125
+ "rank_by_scgr"]], hide_index=True)
126
 
127
  # Display the README.md file
128
  st.markdown(readme_text)
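
The `Escape Potential` column added in `app.py` simply inverts `rank_by_scgr` so that larger values mean higher escape potential, which lets the green-to-red color scale run in the intuitive direction. A minimal sketch of that inversion, using plain lists instead of a DataFrame (the function name `escape_potential` is hypothetical; the four ranks are taken from rows of `output.csv`):

```python
def escape_potential(rank_by_scgr):
    """Invert combined ranks: the smallest rank gets the largest score."""
    top = max(rank_by_scgr)
    return [top + 1 - rank for rank in rank_by_scgr]


# Combined ranks of four rows from output.csv (best row first, worst row last).
scores = escape_potential([2, 8, 10, 39])
print(scores)
```

The worst-ranked sequence always maps to 1, so the color axis has a fixed lower bound regardless of how many sequences are uploaded.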
output.csv ADDED
@@ -0,0 +1,21 @@
1
+ accession_id,sc,sp,ip,gr,log10(sc),log10(sp),log10(ip),log10(gr),rank_by_sc,rank_by_sp,rank_by_ip,rank_by_gr,rank_by_scsp,rank_by_scip,rank_by_scgr
2
+ EPI_ISL_18986233,412.00534,0.0001191464722675543,8.76367634168547e-05,0.0001033916178422045,2.6149027,-3.9239188118774555,-4.05731367009075,-3.9855146689058785,1,1,1,1,2,2,2
3
+ EPI_ISL_18986226,410.80124,0.00010675135854398832,8.531846659052994e-05,9.603491256725913e-05,2.6136317,-3.971626589329606,-4.068956958628795,-4.017570854680065,3,6,5,5,9,8,8
4
+ EPI_ISL_18986234,410.80112,0.0001105443762670885,8.57568435049827e-05,9.81506098860356e-05,2.6136317,-3.956463346421271,-4.066731212651265,-4.00810699744638,4,4,4,4,8,8,8
5
+ EPI_ISL_18986236,410.79172,0.00011333886235977685,8.645286187255717e-05,9.989586211616702e-05,2.6136217,-3.945621150908842,-4.063220625583959,-4.000452500736318,6,2,2,2,8,8,8
6
+ EPI_ISL_18905639,410.78833,0.00011313336462511583,8.597057589875352e-05,9.955197026193468e-05,2.6136181,-3.946409296615614,-4.065650164004875,-4.001950140303386,7,3,3,3,10,10,10
7
+ EPI_ISL_18986282,410.79742,0.0001021404935454484,8.483489298915e-05,9.348769326729921e-05,2.6136277,-3.990802047953487,-4.071425483641914,-4.029245555949711,5,7,6,7,12,11,12
8
+ EPI_ISL_18905700,410.80365,8.802011961961398e-05,8.242524698194001e-05,8.5222683300777e-05,2.6136343,-4.0554180455651325,-4.083939742847436,-4.069444795829361,2,11,10,11,13,12,13
9
+ EPI_ISL_18905699,410.78317,0.00010041191758079406,8.452680671922077e-05,9.246936215000741e-05,2.6136127,-3.9982147390593457,-4.0730055376304914,-4.034002138106995,9,8,7,8,17,16,17
10
+ EPI_ISL_18905723,410.78815,9.433468566337404e-05,8.384878363173483e-05,8.909173464755444e-05,2.613618,-4.0253285933445095,-4.0765032331873785,-4.050162585115344,8,10,8,10,18,16,18
11
+ EPI_ISL_18905548,402.63232,0.00010016525868650206,8.276867690551842e-05,9.146696779601024e-05,2.6049087,-3.999282883029627,-4.08213398713805,-4.038735717889792,10,9,9,9,19,19,19
12
+ EPI_ISL_17781217,4.1642776,0.00010857582674361765,8.239263049415624e-05,9.548422861888695e-05,0.6195397,-3.9642668550458704,-4.084111631492836,-4.020068356054232,14,5,11,6,19,25,20
13
+ EPI_ISL_17793474,4.1757736,8.611938045803635e-05,8.005577636043612e-05,8.308757840923624e-05,0.62073696,-4.064899103147481,-4.096607326443542,-4.08046389837658,12,13,19,15,25,31,27
14
+ EPI_ISL_17741975,4.177282,8.326421727057701e-05,8.009380361014998e-05,8.16790104403635e-05,0.6208938,-4.079541596200184,-4.09640108144401,-4.08788953246489,11,17,18,17,28,29,28
15
+ EPI_ISL_17793691,4.1488695,8.515223677162654e-05,8.163940861815575e-05,8.339582269489114e-05,0.61792976,-4.069803939541474,-4.088100149899155,-4.078855702671536,16,15,14,13,31,30,29
16
+ EPI_ISL_17793453,4.1394196,8.678013742050658e-05,8.199043962649191e-05,8.438528852349925e-05,0.6169394,-4.061579666480137,-4.086236784927503,-4.073733260364415,18,12,12,12,30,30,30
17
+ EPI_ISL_17793872,4.140045,8.529921736529407e-05,8.129646908951845e-05,8.329784322740626e-05,0.61700505,-4.069054953539316,-4.089928316499888,-4.079366243329856,17,14,16,14,31,33,31
18
+ EPI_ISL_17793635,4.1494923,8.446570391242857e-05,8.131957528945877e-05,8.289263960094368e-05,0.61799496,-4.073319594305262,-4.089804898232796,-4.081484030639229,15,16,15,16,31,30,31
19
+ EPI_ISL_17742593,4.1708717,7.075010398693848e-05,8.180073452905663e-05,7.627541925799757e-05,0.62022686,-4.1502729174867765,-4.087242796567261,-4.117615396521551,13,20,13,19,33,26,32
20
+ EPI_ISL_17793695,4.1184025,7.502111548092216e-05,8.087016295337835e-05,7.794563921715025e-05,0.6147288,-4.124816482659202,-4.0922116817851215,-4.108208177035818,20,18,17,18,38,37,38
21
+ EPI_ISL_17793481,4.1235456,7.246959033252027e-05,7.987709535991301e-05,7.617334284621663e-05,0.6152708,-4.1398441937228245,-4.097577736141062,-4.118196985094421,19,19,20,20,38,39,39
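
As a sanity check on the column definitions, the relationships documented in INSTRUCTIONS.md can be verified against rows of `output.csv`. The sketch below embeds the header and the first two data rows verbatim and uses only the standard library; `check_row` is a hypothetical helper, not part of the repository.

```python
import csv
import io
import math

# Header and first two data rows copied from output.csv.
SAMPLE = """\
accession_id,sc,sp,ip,gr,log10(sc),log10(sp),log10(ip),log10(gr),rank_by_sc,rank_by_sp,rank_by_ip,rank_by_gr,rank_by_scsp,rank_by_scip,rank_by_scgr
EPI_ISL_18986233,412.00534,0.0001191464722675543,8.76367634168547e-05,0.0001033916178422045,2.6149027,-3.9239188118774555,-4.05731367009075,-3.9855146689058785,1,1,1,1,2,2,2
EPI_ISL_18986226,410.80124,0.00010675135854398832,8.531846659052994e-05,9.603491256725913e-05,2.6136317,-3.971626589329606,-4.068956958628795,-4.017570854680065,3,6,5,5,9,8,8
"""


def check_row(row, tol=1e-9):
    """Return True if gr, log10(gr) and rank_by_scgr are mutually consistent."""
    sp, ip, gr = float(row["sp"]), float(row["ip"]), float(row["gr"])
    gr_ok = math.isclose(gr, (sp + ip) / 2, rel_tol=tol)
    log_ok = math.isclose(float(row["log10(gr)"]), math.log10(gr), abs_tol=tol)
    rank_ok = int(row["rank_by_scgr"]) == int(row["rank_by_sc"]) + int(row["rank_by_gr"])
    return gr_ok and log_ok and rank_ok


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
results = [check_row(row) for row in rows]
print(results)
```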
predict.py CHANGED
@@ -145,20 +145,16 @@ def calculate_inverse_perplexity(sentence):
145
 
146
 
147
 
148
-
149
- if os.path.exists("average_omicron_embedding.npy"):
150
- average_embedding = np.load("average_omicron_embedding.npy")
151
- print(f"Average Omicron embedding loaded from file with shape {average_embedding.shape}")
152
- else:
153
- omicron = pd.read_csv("omicron.csv")["sequence"].tolist()[:2000]
154
  embeddings = []
155
- for i, sentence in enumerate(omicron):
156
  emb = get_sentence_embedding(sentence)
157
  embeddings.append(emb)
158
- print(f"Embedding calculated for omicron sequence {i} with shape {emb.shape}")
159
  average_embedding = np.mean(embeddings, axis=0)
160
- # Save the average embedding to a file
161
- np.save("average_omicron_embedding.npy", average_embedding)
162
 
163
 
164
 
@@ -206,43 +202,44 @@ def get_results_dict(target_dataset, sc_scores, sp_scores, ip_scores):
206
  results_df["sp"] = sp_scores
207
  results_df["ip"] = ip_scores
208
 
 
 
 
209
  # add log10 scores
210
  results_df["log10(sc)"] = np.log10(results_df["sc"])
211
  results_df["log10(sp)"] = np.log10(results_df["sp"])
212
  results_df["log10(ip)"] = np.log10(results_df["ip"])
 
213
 
214
  # add rank_by_sc, rank_by_sp and rank_by_ip
215
  results_df["rank_by_sc"] = results_df["sc"].rank(ascending=False)
216
  results_df["rank_by_sp"] = results_df["sp"].rank(ascending=False)
217
  results_df["rank_by_ip"] = results_df["ip"].rank(ascending=False)
 
218
 
219
  # make ranks integers
220
  results_df["rank_by_sc"] = results_df["rank_by_sc"].astype(int)
221
  results_df["rank_by_sp"] = results_df["rank_by_sp"].astype(int)
222
  results_df["rank_by_ip"] = results_df["rank_by_ip"].astype(int)
 
223
 
224
-
225
- # add rank_by_sc_sp and rank_by_sc_ip by adding the ranks of sc and sp/ip
226
  results_df["rank_by_scsp"] = results_df["rank_by_sc"] + results_df["rank_by_sp"]
227
  results_df["rank_by_scip"] = results_df["rank_by_sc"] + results_df["rank_by_ip"]
 
228
 
229
-
230
- # sort by rank_by_sc_sp
231
- results_df = results_df.sort_values(by="rank_by_scsp")
232
-
233
-
234
- # Export the results to a JSON file
235
  results_df = results_df.drop(columns=["sequence"])
236
- results_dict = results_df.set_index("accession_id").applymap(lambda x: round(x, 6) if isinstance(x, (int, float)) else x).to_dict(orient="index")
237
- return results_dict
238
 
 
 
 
 
 
 
239
 
240
 
241
  def process_target_data(average_embedding, target_data):
242
  sc_scores, sp_scores, ip_scores = get_sc_sp_ip(average_embedding, target_data)
243
- results_dict = get_results_dict(target_data, sc_scores, sp_scores, ip_scores)
244
- return results_dict
245
-
246
- #out_file_name = f"response.json"
247
- #with open(out_file_name, "w") as f:
248
- # json.dump(results_dict, f, indent=4)
 
145
 
146
 
147
 
148
+ def get_average_embedding(ref_df):
149
+ ref_sequences = ref_df["sequence"].tolist()
150
+ print(f"Calculating average embedding for {len(ref_sequences)} reference sequences...")
 
 
 
151
  embeddings = []
152
+ for i, sentence in enumerate(ref_sequences):
153
  emb = get_sentence_embedding(sentence)
154
  embeddings.append(emb)
155
+ print(f"Embedding calculated for reference sequence {i} with shape {emb.shape}")
156
  average_embedding = np.mean(embeddings, axis=0)
157
+ return average_embedding
 
158
 
159
 
160
 
 
202
  results_df["sp"] = sp_scores
203
  results_df["ip"] = ip_scores
204
 
205
+ # Calculate grammaticality as the mean of sp and ip
206
+ results_df["gr"] = (results_df["sp"] + results_df["ip"]) / 2
207
+
208
  # add log10 scores
209
  results_df["log10(sc)"] = np.log10(results_df["sc"])
210
  results_df["log10(sp)"] = np.log10(results_df["sp"])
211
  results_df["log10(ip)"] = np.log10(results_df["ip"])
212
+ results_df['log10(gr)'] = np.log10(results_df['gr'])
213
 
214
  # add rank_by_sc, rank_by_sp and rank_by_ip
215
  results_df["rank_by_sc"] = results_df["sc"].rank(ascending=False)
216
  results_df["rank_by_sp"] = results_df["sp"].rank(ascending=False)
217
  results_df["rank_by_ip"] = results_df["ip"].rank(ascending=False)
218
+ results_df["rank_by_gr"] = results_df["gr"].rank(ascending=False)
219
 
220
  # make ranks integers
221
  results_df["rank_by_sc"] = results_df["rank_by_sc"].astype(int)
222
  results_df["rank_by_sp"] = results_df["rank_by_sp"].astype(int)
223
  results_df["rank_by_ip"] = results_df["rank_by_ip"].astype(int)
224
+ results_df["rank_by_gr"] = results_df["rank_by_gr"].astype(int)
225
 
226
+ # add rank_by_scsp, rank_by_scip, and rank_by_scgr by summing the rank of sc with the rank of sp/ip/gr
 
227
  results_df["rank_by_scsp"] = results_df["rank_by_sc"] + results_df["rank_by_sp"]
228
  results_df["rank_by_scip"] = results_df["rank_by_sc"] + results_df["rank_by_ip"]
229
+ results_df["rank_by_scgr"] = results_df["rank_by_sc"] + results_df["rank_by_gr"]
230
 
231
+ # Drop the sequence column
 
 
 
 
 
232
  results_df = results_df.drop(columns=["sequence"])
 
 
233
 
234
+ # Apply rounding
235
+ # results_df = results_df.applymap(lambda x: round(x, 6) if isinstance(x, (int, float)) else x)
236
+
237
+ # Sort by rank_by_scgr (ascending) by default
238
+ results_df = results_df.sort_values(by="rank_by_scgr")
239
+ return results_df
240
 
241
 
242
  def process_target_data(average_embedding, target_data):
243
  sc_scores, sp_scores, ip_scores = get_sc_sp_ip(average_embedding, target_data)
244
+ results_df = get_results_dict(target_data, sc_scores, sp_scores, ip_scores)
245
+ return results_df
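
The new `get_average_embedding` reduces the per-sequence embeddings to a single reference vector with an element-wise mean. The sketch below shows the same reduction without NumPy and adds two input checks that the diff does not contain (an empty reference set and mismatched vector lengths). The function name `average_embedding` and the toy vectors are assumptions for illustration; real embeddings would come from the repository's `get_sentence_embedding`.

```python
from statistics import fmean


def average_embedding(embeddings):
    """Element-wise mean of equally sized vectors (the axis-0 mean)."""
    if not embeddings:
        raise ValueError("at least one reference embedding is required")
    length = len(embeddings[0])
    if any(len(vec) != length for vec in embeddings):
        raise ValueError("all embeddings must have the same length")
    # zip(*embeddings) groups the i-th component of every vector together.
    return [fmean(column) for column in zip(*embeddings)]


# Toy 2-dimensional "embeddings" standing in for real model output.
reference = average_embedding([[1.0, 2.0], [3.0, 6.0]])
print(reference)
```

Failing early on an empty upload gives the user a clearer error than a reference vector of undefined values would.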
 
 
 
 
reference_delta.csv ADDED
The diff for this file is too large to render. See raw diff
 
requirements.txt CHANGED
@@ -6,6 +6,7 @@ tokenizers==0.13.3
6
  scanpy==1.9.3
7
  scikit-learn==1.2.2
8
  scipy==1.10.1
 
9
  torch-optimizer==0.3.0
10
  torchmetrics==0.9.0
11
  torch==1.12.1+cu113
 
6
  scanpy==1.9.3
7
  scikit-learn==1.2.2
8
  scipy==1.10.1
9
+ plotly==5.24.1
10
  torch-optimizer==0.3.0
11
  torchmetrics==0.9.0
12
  torch==1.12.1+cu113
response.json DELETED
@@ -1,132 +0,0 @@
1
- {
2
- "EPI_ISL_18905639": {
3
- "sc": 410.788391,
4
- "sp": 0.000113,
5
- "ip": 8.6e-05,
6
- "log10(sc)": 2.613618,
7
- "log10(sp)": -3.946409,
8
- "log10(ip)": -4.06565,
9
- "rank_by_sc": 2,
10
- "rank_by_sp": 3,
11
- "rank_by_ip": 1,
12
- "rank_by_scsp": 5,
13
- "rank_by_scip": 3
14
- },
15
- "EPI_ISL_18905700": {
16
- "sc": 410.80368,
17
- "sp": 8.8e-05,
18
- "ip": 8.2e-05,
19
- "log10(sc)": 2.613634,
20
- "log10(sp)": -4.055418,
21
- "log10(ip)": -4.08394,
22
- "rank_by_sc": 1,
23
- "rank_by_sp": 8,
24
- "rank_by_ip": 7,
25
- "rank_by_scsp": 9,
26
- "rank_by_scip": 8
27
- },
28
- "EPI_ISL_18905699": {
29
- "sc": 410.783234,
30
- "sp": 0.0001,
31
- "ip": 8.5e-05,
32
- "log10(sc)": 2.613613,
33
- "log10(sp)": -3.998215,
34
- "log10(ip)": -4.073006,
35
- "rank_by_sc": 4,
36
- "rank_by_sp": 5,
37
- "rank_by_ip": 2,
38
- "rank_by_scsp": 9,
39
- "rank_by_scip": 6
40
- },
41
- "EPI_ISL_18905723": {
42
- "sc": 410.788208,
43
- "sp": 9.4e-05,
44
- "ip": 8.4e-05,
45
- "log10(sc)": 2.613618,
46
- "log10(sp)": -4.025329,
47
- "log10(ip)": -4.076503,
48
- "rank_by_sc": 3,
49
- "rank_by_sp": 7,
50
- "rank_by_ip": 3,
51
- "rank_by_scsp": 10,
52
- "rank_by_scip": 6
53
- },
54
- "EPI_ISL_18905548": {
55
- "sc": 402.632538,
56
- "sp": 0.0001,
57
- "ip": 8.3e-05,
58
- "log10(sc)": 2.604909,
59
- "log10(sp)": -3.999283,
60
- "log10(ip)": -4.082134,
61
- "rank_by_sc": 5,
62
- "rank_by_sp": 6,
63
- "rank_by_ip": 6,
64
- "rank_by_scsp": 11,
65
- "rank_by_scip": 11
66
- },
67
- "EPI_ISL_17742626": {
68
- "sc": 4.151595,
69
- "sp": 0.000121,
70
- "ip": 8.3e-05,
71
- "log10(sc)": 0.618215,
72
- "log10(sp)": -3.916465,
73
- "log10(ip)": -4.082081,
74
- "rank_by_sc": 9,
75
- "rank_by_sp": 2,
76
- "rank_by_ip": 5,
77
- "rank_by_scsp": 11,
78
- "rank_by_scip": 14
79
- },
80
- "EPI_ISL_17742616": {
81
- "sc": 4.14923,
82
- "sp": 0.000128,
83
- "ip": 8.3e-05,
84
- "log10(sc)": 0.617968,
85
- "log10(sp)": -3.891859,
86
- "log10(ip)": -4.079061,
87
- "rank_by_sc": 10,
88
- "rank_by_sp": 1,
89
- "rank_by_ip": 4,
90
- "rank_by_scsp": 11,
91
- "rank_by_scip": 14
92
- },
93
- "EPI_ISL_17781217": {
94
- "sc": 4.164279,
95
- "sp": 0.000109,
96
- "ip": 8.2e-05,
97
- "log10(sc)": 0.61954,
98
- "log10(sp)": -3.964267,
99
- "log10(ip)": -4.084112,
100
- "rank_by_sc": 8,
101
- "rank_by_sp": 4,
102
- "rank_by_ip": 8,
103
- "rank_by_scsp": 12,
104
- "rank_by_scip": 16
105
- },
106
- "EPI_ISL_17741975": {
107
- "sc": 4.177281,
108
- "sp": 8.3e-05,
109
- "ip": 8e-05,
110
- "log10(sc)": 0.620894,
111
- "log10(sp)": -4.079542,
112
- "log10(ip)": -4.096401,
113
- "rank_by_sc": 6,
114
- "rank_by_sp": 9,
115
- "rank_by_ip": 10,
116
- "rank_by_scsp": 15,
117
- "rank_by_scip": 16
118
- },
119
- "EPI_ISL_17742593": {
120
- "sc": 4.170872,
121
- "sp": 7.1e-05,
122
- "ip": 8.2e-05,
123
- "log10(sc)": 0.620227,
124
- "log10(sp)": -4.150273,
125
- "log10(ip)": -4.087243,
126
- "rank_by_sc": 7,
127
- "rank_by_sp": 10,
128
- "rank_by_ip": 9,
129
- "rank_by_scsp": 17,
130
- "rank_by_scip": 16
131
- }
132
- }
target.csv CHANGED
@@ -1,11 +1,21 @@
1
  accession_id,sequence
2
  EPI_ISL_18905639,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVXYXKNNKSWMESEXRVYSSANNCTFEYVSQPFLMDLEGKXGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYNYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
3
  EPI_ISL_18905723,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDXXYQKNNKSWMESELRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYDYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
4
- EPI_ISL_18905700,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVXYXKNNKSWMESELRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYNYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDVKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
5
  EPI_ISL_18905548,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVXYQKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYNYLYRFXRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENLVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
6
  EPI_ISL_18905699,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVXYQKNNKSWMESELRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYDYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
7
- EPI_ISL_17742626,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQSNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATRFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGFNCYFPLRSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
8
  EPI_ISL_17781217,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAISGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATTFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVGGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGVNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
9
  EPI_ISL_17741975,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHNNNKSWTESEFRVYSSAKNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALNRSYLTPGDSSSDWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATTFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSRVSGNYDYMYRLFRKSKLKPFERDISTEIYQAGNKPCNGVRGSNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
10
  EPI_ISL_17742593,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATRFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGFNCYFPLRSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
11
- EPI_ISL_17742616,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATRFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGFNCYFPLRSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTHTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKLALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELHSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
 
1
  accession_id,sequence
2
  EPI_ISL_18905639,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVXYXKNNKSWMESEXRVYSSANNCTFEYVSQPFLMDLEGKXGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYNYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
3
  EPI_ISL_18905723,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDXXYQKNNKSWMESELRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYDYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
4
+ EPI_ISL_18986232,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFXXXXXFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNKLDSKXXGNYNYLYRLLRKSKLKPFERDISTEIYQAGNKPCNGVXXXNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
5
  EPI_ISL_18905548,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVXYQKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYNYLYRFXRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENLVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
6
  EPI_ISL_18905699,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVXYQKNNKSWMESELRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKPSGNYDYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGVAGPNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
7
+ EPI_ISL_18986236,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYQKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAXFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNKLDSKXXGNYNYLYRFLRKSKLKPFERDISTEIYQAGNKPCNGVXXXNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
8
+ EPI_ISL_18986233,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYQKNNKSWMESELRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAXFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNKLDSKXXGNYNYLYRLFRKSKLKPFERDISTEIYQAGNRPCNGXXXXNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
9
+ EPI_ISL_18986234,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAXFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNKLDSKXXGNYNYLYRLLRKSKLKPFERDISTEIYQAGNKPCNGVXXXNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
10
+ EPI_ISL_18986282,MFVFLVLLPLVSSQCVNLITRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESELRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFXXXFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNKLDSKXXGNYNYRYRLLRKSKLKPFERDISTEIYQAGNKPCNGXXXXNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
11
+ EPI_ISL_18986226,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTHDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPALPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKEGNFKNLREFVFKNIDGYFKIYSKHTPINLERDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPVDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFHEVFNATTFASVYAWNRKRISNCVADYSVIYNFAXFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGKIADYNYKLPDDFTGCVIAWNSNKLDSKXXGNYNYLYRLLRKSKLKPFERDISTEIYQAGNKPCNGVXXXNCYSPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
12
  EPI_ISL_17781217,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAISGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATTFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVGGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGVNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
13
  EPI_ISL_17741975,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHNNNKSWTESEFRVYSSAKNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALNRSYLTPGDSSSDWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATTFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSRVSGNYDYMYRLFRKSKLKPFERDISTEIYQAGNKPCNGVRGSNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
14
  EPI_ISL_17742593,MFVFLVLLPLVSSQCVNLITRTQSYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLGRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFDEVFNATRFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGFNCYFPLRSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
15
+ EPI_ISL_17793453,MFVFLVLLPLVSSQCVNLITRTQLSPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWVFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYHKNNKSWMESGVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTLINSVRHLPQGFSVLEPLVDLPIGINITRFQTLLHRSYLTPGDSSLGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATSFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVSGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGPNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAEISVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
16
+ EPI_ISL_17793481,MFVFLVLLPLVSSQCVNLITRTQLSPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYHKNNKSWMESGVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTLINSVRHLPQGFSVLEPLVDLPIGINITRFQTLLHRSYLTPGDSSLGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATSFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVSGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGPNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGXXXXXXXQTKSHRRARSVASQSIIAYTMSLGAEISVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
17
+ EPI_ISL_17793474,MFVFLVLLPLVSSQCVNLITRTQLSPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYHKNNKSWMESGVYSSANNCTFEYVSQPFLMDLEGKQVNFKNLREFVFKNIDGYFKIYSKHTLINSVRHLPQGFSVLEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSSWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATSFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVSGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGPNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYXXQTKSHRRARSVASQSIIAYTMSLGAEISVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
18
+ EPI_ISL_17793872,MFVFLVLLPLVSSQCVNLITRTQLSPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYHKNNKSWMESGVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTLINSVRHLPQGFSVLEPXXXXXXXXXITRFQTLLHRSYLTPGDSSLGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEXXIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATSFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVSGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGPNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAEISVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLFSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
19
+ EPI_ISL_17793635,MFVFLVLLPLVSSQCVNLITRTQLSPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYHKNNKSWMESGVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTLINSVRHLPQGFSVLEPLVDLPIGINITRFQTLLHRSYLTPGDSSLGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFIGCVIAWNSNKLDSKVSGNYNYMYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGPNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAEISVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
20
+ EPI_ISL_17793691,MFVFLVLLPLVSSQCVNLITRTQLSPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYHKNNKSWMESGVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTLINSVRHLPQGFSVLEPLVDLPIGINITRFQTLLHRSYLTPGDSSLGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATSFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKVSGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGPNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAEISVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPALEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT
21
+ EPI_ISL_17793695,MFVFLVLLPLVSSQCVNLITRTQLSPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLDVYHKNNKSWMESGVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTLINSVRHLPQGFSVLEPLVDLPIGINITRFQTLLHRSYLTPGDSSLGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATSFASVYAWNRKRISNCVADYSVLYNFAPFFAFKCYGVSPTKLNDLCFTNVYADSFVIRGNEVSQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNKLDSKASGNYNYRYRLFRKSNLKPFERDISTEIYQAGNKPCNGVAGPNCYFPLQSYGFRPTYGVGHQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEYVNNSYECDIPIGAGICASYQTQTKSHRRARSVASQSIIAYTMSLGAEISVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLKRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKYFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNHNAQALNTLVKQLSSKFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT