Sheshera Mysore committed on
Commit ffb37ac · 1 Parent(s): 372e9a0

Change to allow profile selection from a dropdown instead of a JSON upload.

app.py CHANGED
@@ -386,12 +386,19 @@ if 'tuning_i' not in st.session_state:
 
 # Ask user to upload a set of seed query papers.
 with st.sidebar:
-    uploaded_file = st.file_uploader("\U0001F331 Upload seed papers",
-                                     type='json',
-                                     help='Upload a json file with titles and abstracts of the papers to '
-                                          'include in your profile.')
-    if uploaded_file is not None:
-        user_papers = json.load(uploaded_file)
+    available_users = os.listdir(os.path.join(in_path, 'users'))
+    available_users.sort()
+    available_users = (None,) + tuple(available_users)
+    # uploaded_file = st.file_uploader("\U0001F331 Upload seed papers",
+    #                                  type='json',
+    #                                  help='Upload a json file with titles and abstracts of the papers to '
+    #                                       'include in your profile.')
+    selected_user = st.selectbox('Select your username from the drop-down',
+                                 available_users)
+    if selected_user is not None:
+        user_papers = json.load(
+            open(os.path.join(in_path, 'users', selected_user, f'seedset-{selected_user}-maple.json')))
+        # user_papers = json.load(uploaded_file)
         # Read user data.
         doc_vectors_user, pid2idx_user, pid2sent_vectors_user, user_kps = read_user(user_papers)
         st.session_state.run_user_kps.append(copy.copy(user_kps))
@@ -408,7 +415,7 @@ with st.sidebar:
 
     st.markdown('\u2b50 Saved papers')
 
-    if uploaded_file is not None:
+    if selected_user is not None:
         # Create a text box where users can see their profile keyphrases.
         st.subheader('\U0001F4DD Seed paper descriptors')
         with st.form('profile_kps'):
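
For reference, the new flow assumes each profile lives at users/<username>/seedset-<username>-maple.json under the app's input directory, which is exactly the layout of the two data files added below. A minimal sketch of the same lookup outside Streamlit, assuming in_path points at the data directory:

import json
import os

in_path = 'data'  # assumption: the app's input directory

# Mirror the dropdown contents: one entry per directory under data/users/.
users_dir = os.path.join(in_path, 'users')
available_users = sorted(os.listdir(users_dir)) if os.path.isdir(users_dir) else []

# Resolve each possible selection to its seedset file, as the diff above does.
for selected_user in available_users:
    seedset_path = os.path.join(users_dir, selected_user,
                                f'seedset-{selected_user}-maple.json')
    with open(seedset_path) as fp:
        user_papers = json.load(fp)
    print(selected_user, '->', len(user_papers['papers']), 'seed papers')

Note that the dropdown is seeded with (None,) + tuple(available_users), so the app starts with no profile selected; the `is not None` checks later in the script gate all profile-dependent UI on an actual selection.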
data/users/agloberson/seedset-agloberson-maple.json ADDED
@@ -0,0 +1,294 @@
+{
+    "username": "agloberson",
+    "s2_authorid": "1786843",
+    "papers": [
+        {
+            "title": "Dissecting Recall of Factual Associations in Auto-Regressive Language Models",
+            "abstract": [
+                "Transformer-based language models (LMs) are known to capture factual knowledge in their parameters.",
+                "While previous work looked into where factual associations are stored, only little is known about how they are retrieved internally during inference.",
+                "We investigate this question through the lens of information flow.",
+                "Given a subject-relation query, we study how the model aggregates information about the subject and relation to predict the correct attribute.",
+                "With interventions on attention edges, we first identify two critical points where information propagates to the prediction: one from the relation positions followed by another from the subject positions.",
+                "Next, by analyzing the information at these points, we unveil a three-step internal mechanism for attribute extraction.",
+                "First, the representation at the last-subject position goes through an enrichment process, driven by the early MLP sublayers, to encode many subject-related attributes.",
+                "Second, information from the relation propagates to the prediction.",
+                "Third, the prediction representation\"queries\"the enriched subject to extract the attribute.",
+                "Perhaps surprisingly, this extraction is typically done via attention heads, which often encode subject-attribute mappings in their parameters.",
+                "Overall, our findings introduce a comprehensive view of how factual associations are stored and extracted internally in LMs, facilitating future research on knowledge localization and editing."
+            ]
+        },
+        {
+            "title": "Evaluating the Ripple Effects of Knowledge Editing in Language Models",
+            "abstract": [
+                "Modern language models capture a large body of factual knowledge.",
+                "However, some facts can be incorrectly induced or become obsolete over time, resulting in factually incorrect generations.",
+                "This has led to the development of various editing methods that allow updating facts encoded by the model.",
+                "Evaluation of these methods has primarily focused on testing whether an individual fact has been successfully injected, and if similar predictions for other subjects have not changed.",
+                "Here we argue that such evaluation is limited, since injecting one fact (e.g. ``Jack Depp is the son of Johnny Depp'') introduces a ``ripple effect'' in the form of additional facts that the model needs to update (e.g.``Jack Depp is the sibling of Lily-Rose Depp'').",
+                "To address this issue, we propose a novel set of evaluation criteria that consider the implications of an edit on related facts.",
+                "Using these criteria, we then construct RippleEdits, a diagnostic benchmark of 5K factual edits, capturing a variety of types of ripple effects.",
+                "We evaluate prominent editing methods on RippleEdits, showing that current methods fail to introduce consistent changes in the model's knowledge.",
+                "In addition, we find that a simple in-context editing baseline obtains the best scores on our benchmark, suggesting a promising research direction for model editing."
+            ]
+        },
+        {
+            "title": "Crawling The Internal Knowledge-Base of Language Models",
+            "abstract": [
+                "Language models are trained on large volumes of text, and as a result their parameters might contain a significant body of factual knowledge.",
+                "Any downstream task performed by these models implicitly builds on these facts, and thus it is highly desirable to have means for representing this body of knowledge in an interpretable way.",
+                "However, there is currently no mechanism for such a representation.",
+                "Here, we propose to address this goal by extracting a knowledge-graph of facts from a given language model.",
+                "We describe a procedure for \u201ccrawling\u201d the internal knowledge-base of a language model.",
+                "Specifically, given a seed entity, we expand a knowledge-graph around it.",
+                "The crawling procedure is decomposed into sub-tasks, realized through specially designed prompts that control for both precision (i.e., that no wrong facts are generated) and recall (i.e., the number of facts generated).",
+                "We evaluate our approach on graphs crawled starting from dozens of seed entities, and show it yields high precision graphs (82-92%), while emitting a reasonable number of facts per entity."
+            ]
+        },
+        {
+            "title": "Predicting masked tokens in stochastic locations improves masked image modeling",
+            "abstract": [
+                "Self-supervised learning is a promising paradigm in deep learning that enables learning from unlabeled data by constructing pretext tasks that require learning useful representations.",
+                "In natural language processing, the dominant pretext task has been masked language modeling (MLM), while in computer vision there exists an equivalent called Masked Image Modeling (MIM).",
+                "However, MIM is challenging because it requires predicting semantic content in accurate locations.",
+                "E.g, given an incomplete picture of a dog, we can guess that there is a tail, but we cannot determine its exact location.",
+                "In this work, we propose FlexPredict, a stochastic model that addresses this challenge by incorporating location uncertainty into the model.",
+                "Specifically, we condition the model on stochastic masked token positions to guide the model toward learning features that are more robust to location uncertainties.",
+                "Our approach improves downstream performance on a range of tasks, e.g, compared to MIM baselines, FlexPredict boosts ImageNet linear probing by 1.6% with ViT-B and by 2.5% for semi-supervised video segmentation using ViT-L."
+            ]
+        },
+        {
+            "title": "Covering Uncommon Ground: Gap-Focused Question Generation for Answer Assessment",
+            "abstract": [
+                "Human communication often involves information gaps between the interlocutors.",
+                "For example, in an educational dialogue a student often provides an answer that is incomplete, and there is a gap between this answer and the perfect one expected by the teacher.",
+                "Successful dialogue then hinges on the teacher asking about this gap in an effective manner, thus creating a rich and interactive educational experience.",
+                "We focus on the problem of generating such gap-focused questions (GFQs) automatically.",
+                "We define the task, highlight key desired aspects of a good GFQ, and propose a model that satisfies these.",
+                "Finally, we provide an evaluation by human annotators of our generated questions compared against human generated ones, demonstrating competitive performance."
+            ]
+        },
+        {
+            "title": "Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs",
+            "abstract": [
+                "Vision and language models (VLMs) have demonstrated remarkable zero-shot (ZS) performance in a variety of tasks.",
+                "However, recent works have shown that even the best VLMs struggle to capture aspects of compositional scene understanding, such as object attributes, relations, and action states.",
+                "In contrast, obtaining structured annotations, such as scene graphs (SGs), that could improve these models is time-consuming and costly, and thus cannot be used on a large scale.",
+                "Here we ask whether small SG datasets can provide sufficient information for enhancing structured understanding of pretrained VLMs.",
+                "We show that it is indeed possible to improve VLMs when learning from SGs by integrating components that incorporate structured information into both visual and textual representations.",
+                "For the visual side, we incorporate a special\"SG Component\"in the image transformer trained to predict SG information, while for the textual side, we utilize SGs to generate fine-grained captions that highlight different compositional aspects of the scene.",
+                "Our method improves the performance of several popular VLMs on multiple VL datasets with only a mild degradation in ZS capabilities."
+            ]
+        },
+        {
+            "title": "LM vs LM: Detecting Factual Errors via Cross Examination",
+            "abstract": [
+                "A prominent weakness of modern language models (LMs) is their tendency to generate factually incorrect text, which hinders their usability.",
+                "A natural question is whether such factual errors can be detected automatically.",
+                "Inspired by truth-seeking mechanisms in law, we propose a factuality evaluation framework for LMs that is based on cross-examination.",
+                "Our key idea is that an incorrect claim is likely to result in inconsistency with other claims that the model generates.",
+                "To discover such inconsistencies, we facilitate a multi-turn interaction between the LM that generated the claim and another LM (acting as an examiner) which introduces questions to discover inconsistencies.",
+                "We empirically evaluate our method on factual claims made by multiple recent LMs on four benchmarks, finding that it outperforms existing methods and baselines, often by a large gap.",
+                "Our results demonstrate the potential of using interacting LMs for capturing factual errors."
+            ]
+        },
+        {
+            "title": "Visual Prompting via Image Inpainting",
+            "abstract": [
+                "How does one adapt a pre-trained visual model to novel downstream tasks without task-specific finetuning or any model modification?",
+                "Inspired by prompting in NLP, this paper investigates visual prompting: given input-output image example(s) of a new task at test time and a new input image, the goal is to automatically produce the output image, consistent with the given examples.",
+                "We show that posing this problem as simple image inpainting - literally just filling in a hole in a concatenated visual prompt image - turns out to be surprisingly effective, provided that the inpainting algorithm has been trained on the right data.",
+                "We train masked auto-encoders on a new dataset that we curated - 88k unlabeled figures from academic papers sources on Arxiv.",
+                "We apply visual prompting to these pretrained models and demonstrate results on various downstream image-to-image tasks, including foreground segmentation, single object detection, colorization, edge detection, etc."
+            ]
+        },
+        {
+            "title": "On the Implicit Bias of Gradient Descent for Temporal Extrapolation",
+            "abstract": [
+                "When using recurrent neural networks (RNNs) it is common practice to apply trained models to sequences longer than those seen in training.",
+                "This\"extrapolating\"usage deviates from the traditional statistical learning setup where guarantees are provided under the assumption that train and test distributions are identical.",
+                "Here we set out to understand when RNNs can extrapolate, focusing on a simple case where the data generating distribution is memoryless.",
+                "We first show that even with infinite training data, there exist RNN models that interpolate perfectly (i.e., they fit the training data) yet extrapolate poorly to longer sequences.",
+                "We then show that if gradient descent is used for training, learning will converge to perfect extrapolation under certain assumptions on initialization.",
+                "Our results complement recent studies on the implicit bias of gradient descent, showing that it plays a key role in extrapolation when learning temporal prediction models."
+            ]
+        },
+        {
+            "title": "PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers using Synthetic Scene Data",
+            "abstract": [
+                "Action recognition models have achieved impressive results by incorporating scene-level annotations, such as objects, their relations, 3D structure, and more.",
+                "However, obtaining annotations of scene structure for videos requires a significant amount of effort to gather and annotate, making these methods expensive to train.",
+                "In contrast, synthetic datasets generated by graphics engines provide powerful alternatives for generating scene-level annotations across multiple tasks.",
+                "In this work, we propose an approach to leverage synthetic scene data for improving video understanding.",
+                "We present a multi-task prompt learning approach for video transformers, where a shared video transformer backbone is enhanced by a small set of specialized parameters for each task.",
+                "Specifically, we add a set of\"task prompts\", each corresponding to a different task, and let each prompt predict task-related annotations.",
+                "This design allows the model to capture information shared among synthetic scene tasks as well as information shared between synthetic scene tasks and a real video downstream task throughout the entire network.",
+                "We refer to this approach as\"Promptonomy\", since the prompts model task-related structure.",
+                "We propose the PromptonomyViT model (PViT), a video transformer that incorporates various types of scene-level information from synthetic data using the\"Promptonomy\"approach.",
+                "PViT shows strong performance improvements on multiple video understanding tasks and datasets.",
+                "Project page: \\url{https://ofir1080.github.io/PromptonomyViT}"
+            ]
+        },
+        {
+            "title": "Active Learning with Label Comparisons",
+            "abstract": [
+                "Supervised learning typically relies on manual annotation of the true labels.",
+                "When there are many potential classes, searching for the best one can be prohibitive for a human annotator.",
+                "On the other hand, comparing two candidate labels is often much easier.",
+                "We focus on this type of pairwise supervision and ask how it can be used effectively in learning, and in particular in active learning.",
+                "We obtain several insightful results in this context.",
+                "In principle, finding the best of $k$ labels can be done with $k-1$ active queries.",
+                "We show that there is a natural class where this approach is sub-optimal, and that there is a more comparison-efficient active learning scheme.",
+                "A key element in our analysis is the\"label neighborhood graph\"of the true distribution, which has an edge between two classes if they share a decision boundary.",
+                "We also show that in the PAC setting, pairwise comparisons cannot provide improved sample complexity in the worst case.",
+                "We complement our theoretical results with experiments, clearly demonstrating the effect of the neighborhood graph on sample complexity."
+            ]
+        },
+        {
+            "title": "Text-Only Training for Image Captioning using Noise-Injected CLIP",
+            "abstract": [
+                "We consider the task of image-captioning using only the CLIP model and additional text data at training time, and no additional captioned images.",
+                "Our approach relies on the fact that CLIP is trained to make visual and textual embeddings similar.",
+                "Therefore, we only need to learn how to translate CLIP textual embeddings back into text, and we can learn how to do this by learning a decoder for the frozen CLIP text encoder using only text.",
+                "We argue that this intuition is\"almost correct\"because of a gap between the embedding spaces, and propose to rectify this via noise injection during training.",
+                "We demonstrate the effectiveness of our approach by showing SOTA zero-shot image captioning across four benchmarks, including style transfer.",
+                "Code, data, and models are available on GitHub."
+            ]
+        },
+        {
+            "title": "Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens",
+            "abstract": [
+                "Recent action recognition models have achieved impressive results by integrating objects, their locations and interactions.",
+                "However, obtaining dense structured annotations for each frame is tedious and time-consuming, making these methods expensive to train and less scalable.",
+                "At the same time, if a small set of annotated images is available, either within or outside the domain of interest, how could we leverage these for a video downstream task?",
+                "We propose a learning framework StructureViT (SViT for short), which demonstrates how utilizing the structure of a small number of images only available during training can improve a video model.",
+                "SViT relies on two key insights.",
+                "First, as both images and videos contain structured information, we enrich a transformer model with a set of object tokens that can be used across images and videos.",
+                "Second, the scene representations of individual frames in video should \u201calign\u201d with those of still images.",
+                "This is achieved via a Frame-Clip Consistency loss, which ensures the flow of structured information between images and videos.",
+                "We explore a particular instantiation of scene structure, namely a Hand-Object Graph, consisting of hands and objects with their locations as nodes, and physical relations of contact/no-contact as edges.",
+                "SViT shows strong performance improvements on multiple video understanding tasks and datasets.",
+                "Furthermore, it won in the Ego4D CVPR\u201922 Object State Localization challenge.",
+                "For code and pretrained models, visit the project page at https://eladb3.github.io/SViT/"
+            ]
+        },
+        {
+            "title": "What Are You Token About? Dense Retrieval as Distributions Over the Vocabulary",
+            "abstract": [
+                "Dual encoders are now the dominant architecture for dense retrieval.",
+                "Yet, we have little understanding of how they represent text, and why this leads to good performance.",
+                "In this work, we shed light on this question via distributions over the vocabulary.",
+                "We propose to interpret the vector representations produced by dual encoders by projecting them into the model\u2019s vocabulary space.",
+                "We show that the resulting projections contain rich semantic information, and draw connection between them and sparse retrieval.",
+                "We find that this view can offer an explanation for some of the failure cases of dense retrievers.",
+                "For example, we observe that the inability of models to handle tail entities is correlated with a tendency of the token distributions to forget some of the tokens of those entities.",
+                "We leverage this insight and propose a simple way to enrich query and passage representations with lexical information at inference time, and show that this significantly improves performance compared to the original model in zero-shot settings, and specifically on the BEIR benchmark."
+            ]
+        },
+        {
+            "title": "Graph Trees with Attention",
+            "abstract": [
+                "When dealing with tabular data, models based on regression and decision trees are a popular choice due to the high accuracy they provide on such tasks and their ease of application as compared to other model classes.",
+                "Yet, when it comes to graph-structure data, current tree learning algorithms do not provide tools to manage the structure of the data other than relying on feature engineering.",
+                "In this work we address the above gap, and introduce Graph Trees with Attention (GTA), a new family of tree-based learning algorithms that are designed to operate on graphs.",
+                "GTA leverages both the graph structure and the features at the vertices and employs an attention mechanism that allows decisions to concentrate on sub-structures of the graph.",
+                "We analyze GTA models and show that they are strictly more expressive than plain decision trees.",
+                "We also demonstrate the benefits of GTA empirically on multiple graph and node prediction benchmarks.",
+                "In these experiments, GTA always outperformed other tree-based models and often outperformed other types of graph-learning algorithms such as Graph Neural Networks (GNNs) and Graph Kernels.",
+                "Finally, we also provide an explainability mechanism for GTA, and demonstrate it can provide intuitive explanations."
+            ]
+        },
+        {
+            "title": "TREE-G: Decision Trees Contesting Graph Neural Networks",
+            "abstract": [
+                "When dealing with tabular data, models based on decision trees are a popular choice due to their high accuracy on these data types, their ease of application, and explainability properties.",
+                "However, when it comes to graph-structured data, it is not clear how to apply them effectively, in a way that incorporates the topological information with the tabular data available on the vertices of the graph.",
+                "To address this challenge, we introduce TREE-G. TREE-G modifies standard decision trees, by introducing a novel split function that is specialized for graph data.",
+                "Not only does this split function incorporate the node features and the topological information, but it also uses a novel pointer mechanism that allows split nodes to use information computed in previous splits.",
+                "Therefore, the split function adapts to the predictive task and the graph at hand.",
+                "We analyze the theoretical properties of TREE-G and demonstrate its benefits empirically on multiple graph and vertex prediction benchmarks.",
+                "In these experiments, TREE-G consistently outperforms other tree-based models and often outperforms other graph-learning algorithms such as Graph Neural Networks (GNNs) and Graph Kernels, sometimes by large margins.",
+                "Moreover, TREE-Gs models and their predictions can be explained and visualized"
+            ]
+        },
+        {
+            "title": "Efficient Learning of CNNs using Patch Based Features",
+            "abstract": [
+                "Recent work has demonstrated the effectiveness of using patch based representations when learning from image data.",
+                "Here we provide theoretical support for this observation, by showing that a simple semi-supervised algorithm that uses patch statistics can efficiently learn labels produced by a one-hidden-layer Convolutional Neural Network (CNN).",
+                "Since CNNs are known to be computationally hard to learn in the worst case, our analysis holds under some distributional assumptions.",
+                "We show that these assumptions are necessary and sufficient for our results to hold.",
+                "We verify that the distributional assumptions hold on real-world data by experimenting on the CIFAR-10 dataset, and find that the analyzed algorithm outperforms a vanilla one-hidden-layer CNN.",
+                "Finally, we demonstrate that by running the algorithm in a layer-by-layer fashion we can build a deep model which gives further improvements, hinting that this method provides insights about the behavior of deep CNNs."
+            ]
+        },
+        {
+            "title": "Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets",
+            "abstract": [
+                "Overparameterization in deep learning typically refers to settings where a trained neural network (NN) has representational capacity to fit the training data in many ways, some of which generalize well, while others do not.",
+                "In the case of Recurrent Neural Networks (RNNs), there exists an additional layer of overparameterization, in the sense that a model may exhibit many solutions that generalize well for sequence lengths seen in training, some of which extrapolate to longer sequences, while others do not.",
+                "Numerous works have studied the tendency of Gradient Descent (GD) to fit overparameterized NNs with solutions that generalize well.",
+                "On the other hand, its tendency to fit overparameterized RNNs with solutions that extrapolate has been discovered only recently and is far less understood.",
+                "In this paper, we analyze the extrapolation properties of GD when applied to overparameterized linear RNNs.",
+                "In contrast to recent arguments suggesting an implicit bias towards short-term memory, we provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.",
+                "Our result relies on a dynamical characterization which shows that GD (with small step size and near-zero initialization) strives to maintain a certain form of balancedness, as well as on tools developed in the context of the moment problem from statistics (recovery of a probability distribution from its moments).",
+                "Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs."
+            ]
+        },
+        {
+            "title": "On the inductive bias of neural networks for learning read-once DNFs",
+            "abstract": [
+                "Learning functions over Boolean variables is a fundamental problem in machine learning.",
+                "But not much is known about learning such functions using neural networks.",
+                "Here we focus on learning read-once disjunctive normal forms (DNFs) under the uniform distribution with a convex neural network and gradient methods.",
+                "We first observe empirically that gradient methods converge to compact solutions with neurons that are aligned with the terms of the DNF.",
+                "This is despite the fact that there are many zero training error networks that do not have this property.",
+                "Thus, the learning process has a clear inductive bias towards such logical formulas.",
+                "Following recent results which connect the inductive bias of gradient flow (GF) to Karush-Kuhn-Tucker (KKT) points of minimum norm problems, we study these KKT points in our setting.",
+                "We prove that zero training error solutions that memorize training points are not KKT points and therefore GF cannot converge to them.",
+                "On the other hand, we prove that globally optimal KKT points correspond exactly to networks that are aligned with the DNF terms.",
+                "These results suggest a strong connection between the inductive bias of GF and solutions that align with the DNF.",
+                "We conclude with extensive experiments which verify our findings."
+            ]
+        },
+        {
+            "title": "Bringing Image Scene Structure to Video via Frame-Clip Consistency of Object Tokens",
+            "abstract": [
+                "Recent action recognition models have achieved impressive results by integrating objects, their locations and interactions.",
+                "However, obtaining dense structured annotations for each frame is tedious and time-consuming, making these methods expensive to train and less scalable.",
+                "At the same time, if a small set of annotated images is available, either within or outside the domain of interest, how could we leverage these for a video downstream task?",
+                "We propose a learning framework StructureViT (SViT for short), which demonstrates how utilizing the structure of a small number of images only available during training can improve a video model.",
+                "SViT relies on two key insights.",
+                "First, as both images and videos contain structured information, we enrich a transformer model with a set of \\emph{object tokens} that can be used across images and videos.",
+                "Second, the scene representations of individual frames in video should\"align\"with those of still images.",
+                "This is achieved via a \\emph{Frame-Clip Consistency} loss, which ensures the flow of structured information between images and videos.",
+                "We explore a particular instantiation of scene structure, namely a \\emph{Hand-Object Graph}, consisting of hands and objects with their locations as nodes, and physical relations of contact/no-contact as edges.",
+                "SViT shows strong performance improvements on multiple video understanding tasks and datasets.",
+                "Furthermore, it won in the Ego4D CVPR'22 Object State Localization challenge.",
+                "For code and pretrained models, visit the project page at \\url{https://eladb3.github.io/SViT/}"
+            ]
+        }
+    ],
+    "user_kps": [
+        "action localization",
+        "attentional model",
+        "automatic question generation",
+        "boolean networks",
+        "deep cnn features",
+        "graph classifiers",
+        "masked conditional neural networks",
+        "multi-label active learning",
+        "neural language model",
+        "neural prediction",
+        "pre-trained networks",
+        "question generation",
+        "recurrent layer",
+        "recurrent networks",
+        "text-based models",
+        "textual representations",
+        "truth annotations",
+        "video representation learning",
+        "visual language model",
+        "word retrieval"
+    ]
+}
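
The added seedset files share one schema: a "username", a Semantic Scholar "s2_authorid", a "papers" list whose entries each carry a "title" and an "abstract" stored sentence-by-sentence, and a flat "user_kps" keyphrase list. A small validation sketch under that assumption (the check_seedset helper is hypothetical, not part of the app):

import json

def check_seedset(path):
    # Hypothetical helper: lightly validate a seedset-<user>-maple.json file
    # against the shape of the files added in this commit.
    with open(path) as fp:
        data = json.load(fp)
    assert isinstance(data['username'], str)
    assert isinstance(data['s2_authorid'], str)  # Semantic Scholar author id
    for paper in data['papers']:
        assert isinstance(paper['title'], str)
        # Abstracts are stored as a list of sentence strings.
        assert all(isinstance(sent, str) for sent in paper['abstract'])
    assert all(isinstance(kp, str) for kp in data['user_kps'])
    return data

profile = check_seedset('data/users/agloberson/seedset-agloberson-maple.json')
print(profile['username'], len(profile['papers']), 'papers,',
      len(profile['user_kps']), 'keyphrases')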
data/users/ddowney/seedset-ddowney-maple.json ADDED
@@ -0,0 +1,274 @@
1
+ {
2
+ "username": "ddowney",
3
+ "s2_authorid": "145612610",
4
+ "papers": [
5
+ {
6
+ "title": "The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces",
7
+ "abstract": [
8
+ "Scholarly publications are key to the transfer of knowledge from scholars to others.",
9
+ "However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows.",
10
+ "In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades.",
11
+ "The PDF format for sharing research papers is widely used due to its portability, but it has significant downsides including: static content, poor accessibility for low-vision readers, and difficulty reading on mobile devices.",
12
+ "This paper explores the question\"Can recent advances in AI and HCI power intelligent, interactive, and accessible reading interfaces -- even for legacy PDFs?\"We describe the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers.",
13
+ "Through this project, we've developed ten research prototype interfaces and conducted usability studies with more than 300 participants and real-world users showing improved reading experiences for scholars.",
14
+ "We've also released a production reading interface for research papers that will incorporate the best features as they mature.",
15
+ "We structure this paper around challenges scholars and the public face when reading research papers -- Discovery, Efficiency, Comprehension, Synthesis, and Accessibility -- and present an overview of our progress and remaining open challenges."
16
+ ]
17
+ },
18
+ {
19
+ "title": "CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context",
20
+ "abstract": [
21
+ "When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work.",
22
+ "However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature reviews.",
23
+ "This paper introduces CiteSee, a paper reading tool that leverages a user's publishing, reading, and saving activities to provide personalized visual augmentations and context around citations.",
24
+ "First, CiteSee connects the current paper to familiar contexts by surfacing known citations a user had cited or opened.",
25
+ "Second, CiteSee helps users prioritize their exploration by highlighting relevant but unknown citations based on saving and reading history.",
26
+ "We conducted a lab study that suggests CiteSee is significantly more effective for paper discovery than three baselines.",
27
+ "A field deployment study shows CiteSee helps participants keep track of their explorations and leads to better situational awareness and increased paper discovery via inline citation when conducting real-world literature reviews."
28
+ ]
29
+ },
30
+ {
31
+ "title": "Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections",
32
+ "abstract": [
33
+ "Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers.",
34
+ "As scientific literature grows, this becomes increasingly challenging.",
35
+ "Meanwhile, authors summarize prior research in papers' related work sections, though this is scoped to support a single paper.",
36
+ "A formative study found that while reading multiple related work paragraphs helps overview a topic, it is hard to navigate overlapping and diverging references and research foci.",
37
+ "In this work, we design a system, Relatedly, that scaffolds exploring and reading multiple related work paragraphs on a topic, with features including dynamic re-ranking and highlighting to spotlight unexplored dissimilar information, auto-generated descriptive paragraph headings, and low-lighting of redundant information.",
38
+ "From a within-subjects user study (n=15), we found that scholars generate more coherent, insightful, and comprehensive topic outlines using Relatedly compared to a baseline paper list."
39
+ ]
40
+ },
41
+ {
42
+ "title": "The Semantic Scholar Open Data Platform",
43
+ "abstract": [
44
+ "The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field.",
45
+ "Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.",
46
+ "We combine public and proprietary data sources using state-of-theart techniques for scholarly PDF content extraction and automatic knowledge graph construction to build the Semantic Scholar Academic Graph, the largest open scientific literature graph to-date, with 200M+ papers, 80M+ authors, 550M+ paper-authorship edges, and 2.4B+ citation edges.",
47
+ "The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.",
48
+ "In this paper, we describe the components of the S2 data processing pipeline and the associated APIs offered by the platform.",
49
+ "We will update this living document to reflect changes as we add new data offerings and improve existing services."
50
+ ]
51
+ },
52
+ {
53
+ "title": "Beyond Summarization: Designing AI Support for Real-World Expository Writing Tasks",
54
+ "abstract": [
55
+ "Large language models have introduced exciting new opportunities and challenges in designing and developing new AI-assisted writing support tools.",
56
+ "Recent work has shown that leveraging this new technology can transform writing in many scenarios such as ideation during creative writing, editing support, and summarization.",
57
+ "However, AI-supported expository writing--including real-world tasks like scholars writing literature reviews or doctors writing progress notes--is relatively understudied.",
58
+ "In this position paper, we argue that developing AI supports for expository writing has unique and exciting research challenges and can lead to high real-world impacts.",
59
+ "We characterize expository writing as evidence-based and knowledge-generating: it contains summaries of external documents as well as new information or knowledge.",
60
+ "It can be seen as the product of authors' sensemaking process over a set of source documents, and the interplay between reading, reflection, and writing opens up new opportunities for designing AI support.",
61
+ "We sketch three components for AI support design and discuss considerations for future research."
62
+ ]
63
+ },
64
+ {
65
+ "title": "FeedLens: Polymorphic Lenses for Personalizing Exploratory Search over Knowledge Graphs",
66
+ "abstract": [
67
+ "The vast scale and open-ended nature of knowledge graphs (KGs) make exploratory search over them cognitively demanding for users.",
68
+ "We introduce a new technique, polymorphic lenses, that improves exploratory search over a KG by obtaining new leverage from the existing preference models that KG-based systems maintain for recommending content.",
69
+ "The approach is based on a simple but powerful observation: in a KG, preference models can be re-targeted to recommend not only entities of a single base entity type (e.g., papers in the scientific literature KG, products in an e-commerce KG), but also all other types (e.g., authors, conferences, institutions; sellers, buyers).",
70
+ "We implement our technique in a novel system, FeedLens, which is built over Semantic Scholar, a production system for navigating the scientific literature KG.",
71
+ "FeedLens reuses the existing preference models on Semantic Scholar\u2014people\u2019s curated research feeds\u2014as lenses for exploratory search.",
72
+ "Semantic Scholar users can curate multiple feeds/lenses for different topics of interest, e.g., one for human-centered AI and another for document embeddings.",
73
+ "Although these lenses are defined in terms of papers, FeedLens re-purposes them to also guide search over authors, institutions, venues, etc.",
74
+ "Our system design is based on feedback from intended users via two pilot surveys (n = 17 and n = 13, respectively).",
75
+ "We compare FeedLens and Semantic Scholar via a third (within-subjects) user study (n = 15) and find that FeedLens increases user engagement while reducing the cognitive effort required to complete a short literature review task.",
76
+ "Our qualitative results also highlight people\u2019s preference for this more effective exploratory search experience enabled by FeedLens."
77
+ ]
78
+ },
79
+ {
80
+ "title": "Don\u2019t Say What You Don\u2019t Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search",
81
+ "abstract": [
82
+ "Abstractive summarization systems today produce fluent and relevant output, but often \u201challucinate\u201d statements not supported by the source text.",
83
+ "We analyze the connection between hallucinations and training data, and find evidence that models hallucinate because they train on target summaries that are unsupported by the source.",
84
+ "Based on our findings, we present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.",
85
+ "Given the model states and outputs at a given step, PINOCCHIO detects likely model hallucinations based on various measures of attribution to the source text.",
86
+ "PINOCCHIO backtracks to find more consistent output, and can opt to produce no summary at all when no consistent generation can be found.",
87
+ "In experiments, we find that PINOCCHIO improves the consistency of generation by an average of 67% on two abstractive summarization datasets, without hurting recall."
88
+ ]
89
+ },
90
+ {
91
+ "title": "ACCoRD: A Multi-Document Approach to Generating Diverse Descriptions of Scientific Concepts",
92
+ "abstract": [
93
+ "Systems that can automatically define unfamiliar terms hold the promise of improving the accessibility of scientific texts, especially for readers who may lack prerequisite background knowledge.",
94
+ "However, current systems assume a single\"best\"description per concept, which fails to account for the many potentially useful ways a concept can be described.",
95
+ "We present ACCoRD, an end-to-end system tackling the novel task of generating sets of descriptions of scientific concepts.",
96
+ "Our system takes advantage of the myriad ways a concept is mentioned across the scientific literature to produce distinct, diverse descriptions of target scientific concepts in terms of different reference concepts.",
97
+ "To support research on the task, we release an expert-annotated resource, the ACCoRD corpus, which includes 1,275 labeled contexts and 1,787 hand-authored concept descriptions.",
98
+ "We conduct a user study demonstrating that (1) users prefer descriptions produced by our end-to-end system, and (2) users prefer multiple descriptions to a single\"best\"description."
99
+ ]
100
+ },
101
+ {
102
+ "title": "I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation",
103
+ "abstract": [
104
+ "Pre-trained language models, despite their rapid advancements powered by scale, still fall short of robust commonsense capabilities.",
105
+ "And yet, scale appears to be the winning recipe; after all, the largest models seem to have acquired the largest amount of commonsense capabilities.",
106
+ "Or is it?",
107
+ "In this paper, we investigate the possibility of a seemingly impossible match: can smaller language models with dismal commonsense capabilities (i.e., GPT-2), ever win over models that are orders of magnitude larger and better (i.e., GPT-3), if the smaller models are powered with novel commonsense distillation algorithms?",
108
+ "The key intellectual question we ask here is whether it is possible, if at all, to design a learning algorithm that does not benefit from scale, yet leads to a competitive level of commonsense acquisition.",
109
+ "In this work, we study the generative models of commonsense knowledge, focusing on the task of generating generics, statements of commonsense facts about everyday concepts, e.g., birds can fly.",
110
+ "We introduce a novel commonsense distillation framework, I2D2, that loosely follows the Symbolic Knowledge Distillation of West et al. but breaks the dependence on the extreme-scale models as the teacher model by two innovations: (1) the novel adaptation of NeuroLogic Decoding to enhance the generation quality of the weak, off-the-shelf language models, and (2) self-imitation learning to iteratively learn from the model's own enhanced commonsense acquisition capabilities.",
111
+ "Empirical results suggest that scale is not the only way, as novel algorithms can be a promising alternative.",
112
+ "Moreover, our study leads to a new corpus of generics, Gen-A-Tomic, that is of the largest and highest quality available to date."
113
+ ]
114
+ },
115
+ {
116
+ "title": "S2AMP: A High-Coverage Dataset of Scholarly Mentorship Inferred from Publications",
117
+ "abstract": [
118
+ "Mentorship is a critical component of academia, but is not as visible as publications, citations, grants, and awards.",
119
+ "Despite the importance of studying the quality and impact of mentorship, there are few large representative mentorship datasets available.",
120
+ "We contribute two datasets to the study of mentorship.",
121
+ "The first has over 300,000 ground truth academic mentor-mentee pairs obtained from multiple diverse, manually-curated sources, and linked to the Semantic Scholar (S2) knowledge graph.",
122
+ "We use this dataset to train an accurate classifier for predicting mentorship relations from bibliographic features, achieving a held-out area under the ROC curve of 0.96.",
123
+ "Our second dataset is formed by applying the classifier to the complete co-authorship graph of S2.",
124
+ "The result is an inferred graph with 137 million weighted mentorship edges among 24 million nodes.",
125
+ "We release this first-of-its-kind dataset to the community to help accelerate the study of scholarly mentorship:https://github.com/allenai/S2AMP-dataCCS CONCEPTS\u2022 Information systems \u2192 Data mining."
126
+ ]
127
+ },
128
+ {
129
+ "title": "Learning to Perform Complex Tasks through Compositional Fine-Tuning of Language Models",
130
+ "abstract": [
131
+ "How to usefully encode compositional task structure has long been a core challenge in AI.",
132
+ "Recent work in chain of thought prompting has shown that for very large neural language models (LMs), explicitly demonstrating the inferential steps involved in a target task may improve performance over end-to-end learning that focuses on the target task alone.",
133
+ "However, chain of thought prompting has significant limitations due to its dependency on huge pretrained LMs.",
134
+ "In this work, we present compositional fine-tuning (CFT): an approach based on explicitly decomposing a target task into component tasks, and then fine-tuning smaller LMs on a curriculum of such component tasks.",
135
+ "We apply CFT to recommendation tasks in two domains, world travel and local dining, as well as a previously studied inferential task (sports understanding).",
136
+ "We show that CFT outperforms end-to-end learning even with equal amounts of data, and gets consistently better as more component tasks are modeled via fine-tuning.",
137
+ "Compared with chain of thought prompting, CFT performs at least as well using LMs only 7.4% of the size, and is moreover applicable to task domains for which data are not available during pretraining."
138
+ ]
139
+ },
140
+ {
141
+ "title": "Building a Shared Conceptual Model of Complex, Heterogeneous Data Systems: A Demonstration",
142
+ "abstract": [
143
+ "The world of data objects and systems is complex and heterogeneous, making collaboration across tools, teams, and institutions difficult.",
144
+ "Important goals like effective data science, responsible data governance, and well-informed data consumption all require participation from multiple parties who share conceptual data models despite being unfamiliar with, or organizationally distant from each other.",
145
+ "In order to be productive together, data collaborators need a shared conceptual model that includes traditional schemas and system models, such as pipelines and procedures.",
146
+ "This shared model does not have to be entirely correct, but to enable effective collaboration, it should be tool-, team-, and institution-independent.",
147
+ "We describe a working demonstration system that aims to build this shared conceptual model.",
148
+ "This system borrows ideas from knowledge graphs and other massive collaborative efforts to curate data artifacts beyond the reach of any one person or institution."
149
+ ]
150
+ },
151
+ {
152
+ "title": "Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities",
153
+ "abstract": [
154
+ "With the advent of large language models, methods for abstractive summarization have made great strides, creating potential for use in applications to aid knowledge workers processing unwieldy document collections.",
155
+ "One such setting is the Civil Rights Litigation Clearinghouse (CRLC) (https://clearinghouse.net),which posts information about large-scale civil rights lawsuits, serving lawyers, scholars, and the general public.",
156
+ "Today, summarization in the CRLC requires extensive training of lawyers and law students who spend hours per case understanding multiple relevant documents in order to produce high-quality summaries of key events and outcomes.",
157
+ "Motivated by this ongoing real-world summarization effort, we introduce Multi-LexSum, a collection of 9,280 expert-authored summaries drawn from ongoing CRLC writing.",
158
+ "Multi-LexSum presents a challenging multi-document summarization task given the length of the source documents, often exceeding two hundred pages per case.",
159
+ "Furthermore, Multi-LexSum is distinct from other datasets in its multiple target summaries, each at a different granularity (ranging from one-sentence\"extreme\"summaries to multi-paragraph narrations of over five hundred words).",
160
+ "We present extensive analysis demonstrating that despite the high-quality summaries in the training data (adhering to strict content and style guidelines), state-of-the-art summarization models perform poorly on this task.",
161
+ "We release Multi-LexSum for further research in summarization methods as well as to facilitate development of applications to assist in the CRLC's mission at https://multilexsum.github.io."
162
+ ]
163
+ },
164
+ {
165
+ "title": "SciRepEval: A Multi-Format Benchmark for Scientific Document Representations",
166
+ "abstract": [
167
+ "Learned representations of scientific documents can serve as valuable input features for downstream tasks, without the need for further fine-tuning.",
168
+ "However, existing benchmarks for evaluating these representations fail to capture the diversity of relevant tasks.",
169
+ "In response, we introduce SciRepEval, the first comprehensive benchmark for training and evaluating scientific document representations.",
170
+ "It includes 25 challenging and realistic tasks, 11 of which are new, across four formats: classification, regression, ranking and search.",
171
+ "We then use the benchmark to study and improve the generalization ability of scientific document representation models.",
172
+ "We show how state-of-the-art models struggle to generalize across task formats, and that simple multi-task training fails to improve them.",
173
+ "However, a new approach that learns multiple embeddings per document, each tailored to a different format, can improve performance.",
174
+ "We experiment with task-format-specific control codes and adapters in a multi-task setting and find that they outperform the existing single-embedding state-of-the-art by up to 1.5 points absolute."
175
+ ]
176
+ },
177
+ {
178
+ "title": "From Who You Know to What You Read: Augmenting Scientific Recommendations with Implicit Social Networks",
179
+ "abstract": [
180
+ "The ever-increasing pace of scientific publication necessitates methods for quickly identifying relevant papers.",
181
+ "While neural recommenders trained on user interests can help, they still result in long, monotonous lists of suggested papers.",
182
+ "To improve the discovery experience we introduce multiple new methods for augmenting recommendations with textual relevance messages that highlight knowledge-graph connections between recommended papers and a user\u2019s publication and interaction history.",
183
+ "We explore associations mediated by author entities and those using citations alone.",
184
+ "In a large-scale, real-world study, we show how our approach significantly increases engagement\u2014and future engagement when mediated by authors\u2014without introducing bias towards highly-cited authors.",
185
+ "To expand message coverage for users with less publication or interaction history, we develop a novel method that highlights connections with proxy authors of interest to users and evaluate it in a controlled lab study.",
186
+ "Finally, we synthesize design implications for future graph-based messages."
187
+ ]
188
+ },
189
+ {
190
+ "title": "CascadER: Cross-Modal Cascading for Knowledge Graph Link Prediction",
191
+ "abstract": [
192
+ "Knowledge graph (KG) link prediction is a fundamental task in artificial intelligence, with applications in natural language processing, information retrieval, and biomedicine.",
193
+ "Recently, promising results have been achieved by leveraging cross-modal information in KGs, using ensembles that combine knowledge graph embeddings (KGEs) and contextual language models (LMs).",
194
+ "However, existing ensembles are either (1) not consistently effective in terms of ranking accuracy gains or (2) impractically inefficient on larger datasets due to the combinatorial explosion problem of pairwise ranking with deep language models.",
195
+ "In this paper, we propose a novel tiered ranking architecture CascadER to maintain the ranking accuracy of full ensembling while improving efficiency considerably.",
196
+ "CascadER uses LMs to rerank the outputs of more efficient base KGEs, relying on an adaptive subset selection scheme aimed at invoking the LMs minimally while maximizing accuracy gain over the KGE.",
197
+ "Extensive experiments demonstrate that CascadER improves MRR by up to 9 points over KGE baselines, setting new state-of-the-art performance on four benchmarks while improving efficiency by one or more orders of magnitude over competitive cross-modal baselines.",
198
+ "Our empirical analyses reveal that diversity of models across modalities and preservation of individual models' confidence signals help explain the effectiveness of CascadER, and suggest promising directions for cross-modal cascaded architectures.",
199
+ "Code and pretrained models are available at https://github.com/tsafavi/cascader."
200
+ ]
201
+ },
202
+ {
203
+ "title": "A Computational Inflection for Scientific Discovery",
204
+ "abstract": [
205
+ "We stand at the foot of a significant inflection in the trajectory of scientific discovery.",
206
+ "As society continues on its fast-paced digital transformation, so does humankind's collective scientific knowledge and discourse.",
207
+ "We now read and write papers in digitized form, and a great deal of the formal and informal processes of science are captured digitally -- including papers, preprints and books, code and datasets, conference presentations, and interactions in social networks and communication platforms.",
208
+ "The transition has led to the growth of a tremendous amount of information, opening exciting opportunities for computational models and systems that analyze and harness it.",
209
+ "In parallel, exponential growth in data processing power has fueled remarkable advances in AI, including self-supervised neural models capable of learning powerful representations from large-scale unstructured text without costly human supervision.",
210
+ "The confluence of societal and computational trends suggests that computer science is poised to ignite a revolution in the scientific process itself.",
211
+ "However, the explosion of scientific data, results and publications stands in stark contrast to the constancy of human cognitive capacity.",
212
+ "While scientific knowledge is expanding with rapidity, our minds have remained static, with severe limitations on the capacity for finding, assimilating and manipulating information.",
213
+ "We propose a research agenda of task-guided knowledge retrieval, in which systems counter humans' bounded capacity by ingesting corpora of scientific knowledge and retrieving inspirations, explanations, solutions and evidence synthesized to directly augment human performance on salient tasks in scientific endeavors.",
214
+ "We present initial progress on methods and prototypes, and lay out important opportunities and challenges ahead with computational approaches that have the potential to revolutionize science."
215
+ ]
216
+ },
217
+ {
218
+ "title": "Infrastructure for Rapid Open Knowledge Network Development",
219
+ "abstract": [
220
+ "The past decade has witnessed a growth in the use of knowledge graph technologies for advanced data search, data integration, and query-answering applications.",
221
+ "The leading example of a public, general-purpose open knowledge network (aka\u00a0knowledge graph) is Wikidata, which has demonstrated remarkable advances in quality and coverage over this time.",
222
+ "Proprietary knowledge graphs drive some of the leading applications of the day including, for example, Google Search, Alexa, Siri, and Cortana.",
223
+ "Open Knowledge Networks are exciting: they promise the power of structured database-like queries with the potential for the wide coverage that is today only provided by the Web.",
224
+ "With the current state of the art, building, using, and scaling large knowledge networks can still be frustratingly slow.",
225
+ "This article describes a National Science Foundation Convergence Accelerator project to build a set of Knowledge Network Programming Infrastructure systems to address this\u00a0issue."
226
+ ]
227
+ },
228
+ {
229
+ "title": "Penguins Don't Fly: Reasoning about Generics through Instantiations and Exceptions",
230
+ "abstract": [
231
+ "Generics express generalizations about the world (e.g., birds can fly) that are not universally true (e.g., newborn birds and penguins cannot fly).",
232
+ "Commonsense knowledge bases, used extensively in NLP, encode some generic knowledge but rarely enumerate such exceptions and knowing when a generic statement holds or does not hold true is crucial for developing a comprehensive understanding of generics.",
233
+ "We present a novel framework informed by linguistic theory to generate exemplars -- specific cases when a generic holds true or false.",
234
+ "We generate ~19k exemplars for ~650 generics and show that our framework outperforms a strong GPT-3 baseline by 12.8 precision points.",
235
+ "Our analysis highlights the importance of linguistic theory-based controllability for generating exemplars, the insufficiency of knowledge bases as a source of exemplars, and the challenges exemplars pose for the task of natural language inference."
236
+ ]
237
+ },
238
+ {
239
+ "title": "Embedding Recycling for Language Models",
240
+ "abstract": [
241
+ "Real-world applications of neural language models often involve running many different models over the same corpus.",
242
+ "The high computational cost of these runs has led to interest in techniques that can reuse the contextualized embeddings produced in previous runs to speed training and inference of future ones.",
243
+ "We refer to this approach as embedding recycling (ER).",
244
+ "While multiple ER techniques have been proposed, their practical effectiveness is still unknown because existing evaluations consider very few models and do not adequately account for overhead costs.",
245
+ "We perform an extensive evaluation of ER across eight different models (17 to 900 million parameters) and fourteen tasks in English.",
246
+ "We show how a simple ER technique that caches activations from an intermediate layer of a pretrained model, and learns task-specific adapters on the later layers, is broadly effective.",
247
+ "For the best-performing baseline in our experiments (DeBERTa-v2 XL), adding a precomputed cache results in a>90% speedup during training and 87-91% speedup for inference, with negligible impact on accuracy.",
248
+ "Our analysis reveals important areas of future work."
249
+ ]
250
+ }
251
+ ],
252
+ "user_kps": [
253
+ "biomedical literature mining",
254
+ "citation context analysis",
255
+ "citation network",
256
+ "co-authorship networks",
257
+ "coherent summaries",
258
+ "collaborative writing",
259
+ "conceptual data models",
260
+ "conceptual tool",
261
+ "document representations",
262
+ "e-science",
263
+ "exploratory searches",
264
+ "human readers",
265
+ "knowledge graph",
266
+ "knowledge graph embedding",
267
+ "literature-based discovery",
268
+ "machine comprehension",
269
+ "neural language models",
270
+ "neural ranking models",
271
+ "terminology extraction",
272
+ "textual summaries"
273
+ ]
274
+ }
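
Each seedset file added in this commit follows the same schema: a "username", an "s2_authorid", a "papers" list whose entries hold a "title" and a sentence-split "abstract", and a "user_kps" list of profile keyphrases. Below is a minimal sketch of loading and sanity-checking one of these files; the helper name load_seedset and the asserts are illustrative, not part of this commit.

```python
import json
import os

def load_seedset(in_path, username):
    # Path layout used by the files added in this commit:
    # <in_path>/users/<username>/seedset-<username>-maple.json
    fpath = os.path.join(in_path, 'users', username,
                         f'seedset-{username}-maple.json')
    with open(fpath) as f:
        seedset = json.load(f)
    # Illustrative schema checks (not performed by the app itself).
    assert seedset['username'] == username
    for paper in seedset['papers']:
        assert isinstance(paper['title'], str)
        # Abstracts are stored as lists of sentences, not one string.
        assert isinstance(paper['abstract'], list)
    return seedset
```
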
data/users/hzamani/seedset-hzamani-maple.json ADDED
@@ -0,0 +1,274 @@
1
+ {
2
+ "username": "hzamani",
3
+ "s2_authorid": "2499986",
4
+ "papers": [
5
+ {
6
+ "title": "You can't pick your neighbors, or can you? When and how to rely on retrieval in the $k$NN-LM",
7
+ "abstract": [
8
+ "Retrieval-enhanced language models (LMs), which condition their predictions on text retrieved from large external datastores, have re-cently shown signi\ufb01cant perplexity improvements compared to standard LMs.",
9
+ "One such approach, the k NN-LM, interpolates any existing LM\u2019s predictions with the output of a k nearest neighbors model and requires no additional training.",
10
+ "In this paper, we explore the importance of lexical and semantic matching in the context of items retrieved by k NN-LM.",
11
+ "We \ufb01nd two trends: (1) the presence of large overlapping n -grams between the datastore and evaluation set plays an important fac-tor in strong performance, even when the datastore is derived from the training data; and (2) the k NN-LM is most bene\ufb01cial when retrieved items have high semantic similarity with the query.",
12
+ "Based on our analysis, we de\ufb01ne a new formulation of the k NN-LM that uses retrieval quality to assign the interpolation coef\ufb01cient.",
13
+ "We empirically measure the effectiveness of our approach on two English language modeling datasets, Wikitext-103 and PG-19.",
14
+ "Our re-formulation of the k NN-LM is bene\ufb01cial in both cases, and leads to nearly 4% improvement in perplexity on the Wikitext-103 test set."
15
+ ]
16
+ },
17
+ {
18
+ "title": "Maruna Bot: An Extensible Retrieval-Focused Framework for Task-Oriented Dialogues",
19
+ "abstract": [
20
+ "We present Maruna Bot, a Task-Oriented Dialogue System (TODS) that assists people in cooking or Do-It-Yourself (DIY) tasks using either a speech-only or multi-modal (speech and screen) interface.",
21
+ "Building such a system is challenging, because it touches many research areas including language understanding, text generation, task planning, dialogue state tracking, question answering, multi-modal retrieval, instruction summarization, robustness, and result presentation, among others.",
22
+ "Our bot lets users choose their desired tasks with flexible phrases, uses multi-stage intent classification, asks clarifying questions to improve retrieval, supports in-task and open-domain Question Answering throughout the conversation, effectively maintains the task status, performs query expansion and instruction re-ranking using both textual and visual signals."
23
+ ]
24
+ },
25
+ {
26
+ "title": "Conversational Information Seeking",
27
+ "abstract": [
28
+ "Conversational information seeking (CIS) is concerned with a sequence of interactions between one or more users and an information system.",
29
+ "Interactions in CIS are primarily based on natural language dialogue, while they may include other types of interactions, such as click, touch, and body gestures.",
30
+ "This monograph provides a thorough overview of CIS definitions, applications, interactions, interfaces, design, implementation, and evaluation.",
31
+ "This monograph views CIS applications as including conversational search, conversational question answering, and conversational recommendation.",
32
+ "Our aim is to provide an overview of past research related to CIS, introduce the current state-of-the-art in CIS, highlight the challenges still being faced in the community.",
33
+ "and suggest future directions."
34
+ ]
35
+ },
36
+ {
37
+ "title": "Curriculum Learning for Dense Retrieval Distillation",
38
+ "abstract": [
39
+ "Recent work has shown that more effective dense retrieval models can be obtained by distilling ranking knowledge from an existing base re-ranking model.",
40
+ "In this paper, we propose a generic curriculum learning based optimization framework called CL-DRD that controls the difficulty level of training data produced by the re-ranking (teacher) model.",
41
+ "CL-DRD iteratively optimizes the dense retrieval (student) model by increasing the difficulty of the knowledge distillation data made available to it.",
42
+ "In more detail, we initially provide the student model coarse-grained preference pairs between documents in the teacher's ranking, and progressively move towards finer-grained pairwise document ordering requirements.",
43
+ "In our experiments, we apply a simple implementation of the CL-DRD framework to enhance two state-of-the-art dense retrieval models.",
44
+ "Experiments on three public passage retrieval datasets demonstrate the effectiveness of our proposed framework."
45
+ ]
46
+ },
47
+ {
48
+ "title": "Stochastic Optimization of Text Set Generation for Learning Multiple Query Intent Representations",
49
+ "abstract": [
50
+ "Learning multiple intent representations for queries has potential applications in facet generation, document ranking, search result diversification, and search explanation.",
51
+ "The state-of-the-art model for this task assumes that there is a sequence of intent representations.",
52
+ "In this paper, we argue that the model should not be penalized as long as it generates an accurate and complete set of intent representations.",
53
+ "Based on this intuition, we propose a stochastic permutation invariant approach for optimizing such networks.",
54
+ "We extrinsically evaluate the proposed approach on a facet generation task and demonstrate significant improvements compared to competitive baselines.",
55
+ "Our analysis shows that the proposed permutation invariant approach has the highest impact on queries with more potential intents."
56
+ ]
57
+ },
58
+ {
59
+ "title": "The cardioprotective effects of nano\u2010curcumin against doxorubicin\u2010induced cardiotoxicity: A systematic review",
60
+ "abstract": [
61
+ "Although the chemotherapeutic drug, doxorubicin, is commonly used to treat various malignant tumors, its clinical use is restricted because of its toxicity especially cardiotoxicity.",
62
+ "The use of curcumin may alleviate some of the doxorubicin\u2010induced cardiotoxic effects.",
63
+ "Especially, using the nano\u2010formulation of curcumin can overcome the poor bioavailability of curcumin and enhance its physicochemical properties regarding its efficacy.",
64
+ "In this study, we systematically reviewed the potential cardioprotective effects of nano\u2010curcumin against the doxorubicin\u2010induced cardiotoxicity.",
65
+ "A systematic search was accomplished based on Preferred Reporting Items for Systematic Reviews and Meta\u2010Analyses guidelines for the identification of all relevant articles on \u201cthe role of nano\u2010curcumin on doxorubicin\u2010induced cardiotoxicity\u201d in the electronic databases of Scopus, PubMed, and Web of Science up to July 2021.",
66
+ "One hundred and sixty\u2010nine articles were screened following a predefined set of inclusion and exclusion criteria.",
67
+ "Ten eligible scientific papers were finally included in the present systematic review.",
68
+ "The administration of doxorubicin reduced the body and heart weights of mice/rats compared to the control groups.",
69
+ "In contrast, the combined treatment of doxorubicin and nano\u2010curcumin increased the body and heart weights of animals compared with the doxorubicin\u2010treated groups alone.",
70
+ "Furthermore, doxorubicin could significantly induce the biochemical and histological changes in the cardiac tissue; however, coadministration of nano\u2010curcumin formulation demonstrated a pattern opposite to the doxorubicin\u2010induced changes.",
71
+ "The coadministration of nano\u2010curcumin alleviates the doxorubicin\u2010induced cardiotoxicity through various mechanisms including antioxidant, anti\u2010inflammatory, and antiapoptotic effects.",
72
+ "Also, the cardioprotective effect of nano\u2010curcumin formulation against doxorubicin\u2010induced cardiotoxicity was higher than free curcumin."
73
+ ]
74
+ },
75
+ {
76
+ "title": "Multi-Task Retrieval-Augmented Text Generation with Relevance Sampling",
77
+ "abstract": [
78
+ "This paper studies multi-task training of retrieval-augmented generation models for knowledge-intensive tasks.",
79
+ "We propose to clean the training set by utilizing a distinct property of knowledge-intensive generation: The connection of query-answer pairs to items in the knowledge base.",
80
+ "We \ufb01lter training examples via a threshold of con\ufb01dence on the relevance labels, whether a pair is answerable by the knowledge base or not.",
81
+ "We train a single Fusion-in-Decoder (FiD) generator on seven combined tasks of the KILT benchmark.",
82
+ "The experimental results suggest that our simple yet effective approach substantially improves competitive baselines on two strongly imbalanced tasks; and shows either smaller improvements or no signi\ufb01cant regression on the remaining tasks.",
83
+ "Furthermore, we demonstrate our multi-task training with relevance label sampling scales well with increased model capacity and achieves state-of-the-art results in \ufb01ve out of seven KILT tasks."
84
+ ]
85
+ },
86
+ {
87
+ "title": "Revisiting Open Domain Query Facet Extraction and Generation",
88
+ "abstract": [
89
+ "Web search queries can often be characterized by various facets.",
90
+ "Extracting and generating query facets has various real-world applications, such as displaying facets to users in a search interface, search result diversification, clarifying question generation, and enabling exploratory search.",
91
+ "In this work, we revisit the task of query facet extraction and generation and study various formulations of this task, including facet extraction as sequence labeling, facet generation as autoregressive text generation or extreme multi-label classification.",
92
+ "We conduct extensive experiments and demonstrate that these approaches lead to complementary sets of facets.",
93
+ "We also explored various aggregation approaches based on relevance and diversity to combine the facet sets produced by different formulations of the task.",
94
+ "The approaches presented in this paper outperform state-of-the-art baselines in terms of both precision and recall.",
95
+ "We confirm the quality of the proposed methods through manual annotation.",
96
+ "Since there is no open-source software for facet extraction and generation, we release a toolkit named Faspect, that includes various model implementations for this task."
97
+ ]
98
+ },
99
+ {
100
+ "title": "Conversational Information Seeking: Theory and Application",
101
+ "abstract": [
102
+ "Conversational information seeking (CIS) involves interaction sequences between one or more users and an information system.",
103
+ "Interactions in CIS are primarily based on natural language dialogue, while they may include other types of interactions, such as click, touch, and body gestures.",
104
+ "CIS recently attracted significant attention and advancements continue to be made.",
105
+ "This tutorial follows the content of the recent Conversational Information Seeking book authored by several of the tutorial presenters.",
106
+ "The tutorial aims to be an introduction to CIS for newcomers to CIS in addition to the recent advanced topics and state-of-the-art approaches for students and researchers with moderate knowledge of the topic.",
107
+ "A significant part of the tutorial is dedicated to hands-on experiences based on toolkits developed by the presenters for conversational passage retrieval and multi-modal task-oriented dialogues.",
108
+ "The outcomes of this tutorial include theoretical and practical knowledge, including a forum to meet researchers interested in CIS."
109
+ ]
110
+ },
111
+ {
112
+ "title": "MIMICS-Duo: Offline & Online Evaluation of Search Clarification",
113
+ "abstract": [
114
+ "Asking clarification questions is an active area of research; however, resources for training and evaluating search clarification methods are not sufficient.",
115
+ "To address this issue, we describe MIMICS-Duo, a new freely available dataset of 306 search queries with multiple clarifications (a total of 1,034 query-clarification pairs).",
116
+ "MIMICS-Duo contains fine-grained annotations on clarification questions and their candidate answers and enhances the existing MIMICS datasets by enabling multi-dimensional evaluation of search clarification methods, including online and offline evaluation.",
117
+ "We conduct extensive analysis to demonstrate the relationship between offline and online search clarification datasets and outline several research directions enabled by MIMICS-Duo.",
118
+ "We believe that this resource will help researchers better understand clarification in search."
119
+ ]
120
+ },
121
+ {
122
+ "title": "Are We There Yet? A Decision Framework for Replacing Term Based Retrieval with Dense Retrieval Systems",
123
+ "abstract": [
124
+ "Recently, several dense retrieval (DR) models have demonstrated competitive performance to term-based retrieval that are ubiquitous in search systems.",
125
+ "In contrast to term-based matching, DR projects queries and documents into a dense vector space and retrieves results via (approximate) nearest neighbor search.",
126
+ "Deploying a new system, such as DR, inevitably involves tradeoffs in aspects of its performance.",
127
+ "Established retrieval systems running at scale are usually well understood in terms of effectiveness and costs, such as query latency, indexing throughput, or storage requirements.",
128
+ "In this work, we propose a framework with a set of criteria that go beyond simple effectiveness measures to thoroughly compare two retrieval systems with the explicit goal of assessing the readiness of one system to replace the other.",
129
+ "This includes careful tradeoff considerations between effectiveness and various cost factors.",
130
+ "Furthermore, we describe guardrail criteria, since even a system that is better on average may have systematic failures on a minority of queries.",
131
+ "The guardrails check for failures on certain query characteristics and novel failure types that are only possible in dense retrieval systems.",
132
+ "We demonstrate our decision framework on a Web ranking scenario.",
133
+ "In that scenario, state-of-the-art DR models have surprisingly strong results, not only on average performance but passing an extensive set of guardrail tests, showing robustness on different query characteristics, lexical matching, generalization, and number of regressions.",
134
+ "DR with approximate nearest neighbor search has comparable low query latency to term-based systems.",
135
+ "The main reason to reject current DR models in this scenario is the cost of vectorization, which is much higher than the cost of building a traditional index.",
136
+ "It is impossible to predict whether DR will become ubiquitous in the future, but one way this is possible is through repeated applications of decision processes such as the one presented here."
137
+ ]
138
+ },
139
+ {
140
+ "title": "Stochastic Retrieval-Conditioned Reranking",
141
+ "abstract": [
142
+ "The multi-stage cascaded architecture has been adopted by many search engines for efficient and effective retrieval.",
143
+ "This architecture consists of a stack of retrieval and reranking models in which efficient retrieval models are followed by effective (neural) learning-to-rank models.",
144
+ "The optimization of these learning-to-rank models is loosely connected to the early stage retrieval models.",
145
+ "This paper draws theoretical connections between the early stage retrieval and late stage reranking models by deriving expected reranking performance conditioned on the early stage retrieval results.",
146
+ "Our findings shed light on optimization of both retrieval and reranking models.",
147
+ "As a result, we also introduce a novel loss function for training reranking models that leads to significant improvements on multiple public benchmarks.",
148
+ "Our findings provide theoretical and empirical guidelines for developing multi-stage cascaded retrieval models."
149
+ ]
150
+ },
151
+ {
152
+ "title": "Predicting Prerequisite Relations for Unseen Concepts",
153
+ "abstract": [
154
+ "Concept prerequisite learning (CPL) plays a key role in developing technologies that assist people to learn a new complex topic or concept.",
155
+ "Previous work commonly assumes that all concepts are given at training time and solely focuses on predicting the unseen prerequisite relationships between them.",
156
+ "However, many real-world scenarios deal with concepts that are left undiscovered at training time, which is relatively unexplored.",
157
+ "This paper studies this problem and proposes a novel alternating knowledge distillation approach to take advantage of both content- and graph-based models for this task.",
158
+ "Extensive experiments on three public benchmarks demonstrate up to 10% improvements in terms of F1 score."
159
+ ]
160
+ },
161
+ {
162
+ "title": "Entrance Surface Dose Measurement at Thyroid and Parotid Gland Regions in Cone-Beam Computed Tomography and Panoramic Radiography",
163
+ "abstract": [
164
+ "Purpose: Ionizing radiation-absorbed doses is a crucial concern in Cone-Beam Computed Tomography (CBCT) and panoramic radiography.",
165
+ "This study aimed to evaluate and compare the Entrance Skin Doses (ESD) of thyroid and parotid gland regions in CBCT and panoramic radiography in Yazd province, Iran.",
166
+ "\nMaterials and Methods: In this cross-sectional study, 332 patients were included, who were then divided into two age groups (adult and pediatric) and underwent dental CBCT and panoramic radiography.",
167
+ "Twelve Thermoluminescence Dosimeters (TLD- GR200) were used for each patient to measure the ESD of thyroid and parotid glands.",
168
+ "The differences between the ESD values in CBCT and panoramic examinations as well as between the adults and children groups were evaluated by one-way ANOVA and Man-Whitney tests.",
169
+ "\nResults: The mean and Standard Deviation (SD) values of ESD in panoramic imaging were equal to 61 \u00b1 4 and 290 \u00b1 12 \u00b5Gy for the thyroid and parotid glands of the adult groups, respectively.",
170
+ "Notably, these values for CBCT were significantly higher (P<0.01), as 377 \u00b1 139 and 1554 \u00b1 177 \u00b5Gy, respectively.",
171
+ "Moreover, the mean ESD values in the panoramic examination were 41 \u00b1 3 and 190 \u00b1 16 \u00b5Gy for thyroid and parotid glands for the children group, while they were 350 \u00b1 120 and 990 \u00b1 107 \u00b5Gy in CBCT (P<0.01), respectively.",
172
+ "The ESD values in the parotid gland were approximately 3.4 (2.8-4.1) and 4.7 (4.6-4.8) times greater than those for CBCT and panoramic examinations, respectively.",
173
+ "\nConclusion: Although CBCT provides supplementary diagnostic advantages, the thyroid and parotid glands\u2019 doses are higher than panoramic radiography.",
174
+ "Therefore, the risks and benefits of each method should be considered before their prescription."
175
+ ]
176
+ },
177
+ {
178
+ "title": "Estimating the risks of exposure-induced death associated with common computed tomography procedures",
179
+ "abstract": [
180
+ "Background : This study aimed to assess the risks of exposure - induced death (REID) in patients and embryos during CT examinations in Yazd province (Iran).",
181
+ "Materials and Methods: Data on the exposure parameters were retrospectively collected from six imaging institutions.",
182
+ "In total, 932 patients were included in this study and for each patient, organ doses were then estimated using ImpactDose software.",
183
+ "The REIDs were calculated by BEIR VII risk model and using PCXMC software.",
184
+ "In the case of gestational irradiation, excess cancer risk of 0.006% per mSv was taken into account in terms of the ICRP 84 recommendations, to calculate the excess childhood cancer risk imposed on the embryo.",
185
+ "Results: The highest estimated organ doses for abdomen - pelvis, routine chest, chest HRCT, brain, and sinus examinations were obtained as 12.82 mSv for kidneys, 12.09 mSv for thymus, 13.16 mSv for thymus, 29.71 mSv for brain, and 11.70 mSv for oral mucosa, respectively.",
186
+ "Across all procedures, abdomen - pelvis CT scan induced the highest excess REID to the patients (240 deaths per million).",
187
+ "The highest delivered dose to the fetus was roughly 35 mSv, which was lower than the threshold dose proposed by ICRP (100 mSv) for the induction of malformations.",
188
+ "However, the associated excess fatal childhood cancer risk of 2122 incidence per million scans can be a subject of concern for public health experts.",
189
+ "Conclusion: Based on the results, although death risks related to induced cancer from CT scans were negligible, this risk can be relatively significant for children exposed during the fetal period."
190
+ ]
191
+ },
192
+ {
193
+ "title": "Retrieval-Enhanced Machine Learning",
194
+ "abstract": [
195
+ "Although information access systems have long supportedpeople in accomplishing a wide range of tasks, we propose broadening the scope of users of information access systems to include task-driven machines, such as machine learning models.",
196
+ "In this way, the core principles of indexing, representation, retrieval, and ranking can be applied and extended to substantially improve model generalization, scalability, robustness, and interpretability.",
197
+ "We describe a generic retrieval-enhanced machine learning (REML) framework, which includes a number of existing models as special cases.",
198
+ "REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization.",
199
+ "The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence."
200
+ ]
201
+ },
202
+ {
203
+ "title": "Analyzing clarification in asynchronous information\u2010seeking conversations",
204
+ "abstract": [
205
+ "This research analyzes human\u2010generated clarification questions to provide insights into how they are used to disambiguate and provide a better understanding of information needs.",
206
+ "A set of clarification questions is extracted from posts on the Stack Exchange platform.",
207
+ "Novel taxonomy is defined for the annotation of the questions and their responses.",
208
+ "We investigate the clarification questions in terms of whether they add any information to the post (the initial question posted by the asker) and the accepted answer, which is the answer chosen by the asker.",
209
+ "After identifying, which clarification questions are more useful, we investigated the characteristics of these questions in terms of their types and patterns.",
210
+ "Non\u2010useful clarification questions are identified, and their patterns are compared with useful clarifications.",
211
+ "Our analysis indicates that the most useful clarification questions have similar patterns, regardless of topic.",
212
+ "This research contributes to an understanding of clarification in conversations and can provide insight for clarification dialogues in conversational search scenarios and for the possible system generation of clarification requests in information\u2010seeking conversations."
213
+ ]
214
+ },
215
+ {
216
+ "title": "DISAPERE: A Dataset for Discourse Structure in Peer Review Discussions",
217
+ "abstract": [
218
+ "At the foundation of scientific evaluation is the labor-intensive process of peer review.",
219
+ "This critical task requires participants to consume vast amounts of highly technical text.",
220
+ "Prior work has annotated different aspects of review argumentation, but discourse relations between reviews and rebuttals have yet to be examined.",
221
+ "We present DISAPERE, a labeled dataset of 20k sentences contained in 506 review-rebuttal pairs in English, annotated by experts.",
222
+ "DISAPERE synthesizes label sets from prior work and extends them to include fine-grained annotation of the rebuttal sentences, characterizing their context in the review and the authors\u2019 stance towards review arguments.",
223
+ "Further, we annotate every review and rebuttal sentence.",
224
+ "We show that discourse cues from rebuttals can shed light on the quality and interpretation of reviews.",
225
+ "Further, an understanding of the argumentative strategies employed by the reviewers and authors provides useful signal for area chairs and other decision makers."
226
+ ]
227
+ },
228
+ {
229
+ "title": "FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation",
230
+ "abstract": [
231
+ "Retrieval-augmented generation models offer many bene\ufb01ts over standalone language models: besides a textual answer to a given query they provide provenance items retrieved from an updateable knowledge base.",
232
+ "However, they are also more complex systems and need to handle long inputs.",
233
+ "In this work, we introduce FiD-Light to strongly increase the ef\ufb01ciency of the state-of-the-art retrieval-augmented FiD model, while maintaining the same level of effectiveness.",
234
+ "Our FiD-Light model constrains the information \ufb02ow from the encoder (which encodes passages separately) to the decoder (using concatenated encoded representations).",
235
+ "Fur-thermore, we adapt FiD-Light with re-ranking capabilities through textual source pointers, to improve the top-ranked provenance precision.",
236
+ "Our experiments on a diverse set of seven knowledge intensive tasks (KILT) show FiD-Light consistently improves the Pareto frontier between query latency and effectiveness.",
237
+ "FiD-Light with source pointing sets substantial new state-of-the-art results on six KILT tasks for combined text generation and provenance retrieval evaluation, while maintaining reasonable ef\ufb01ciency."
238
+ ]
239
+ },
240
+ {
241
+ "title": "Generalizing Discriminative Retrieval Models using Generative Tasks",
242
+ "abstract": [
243
+ "Information Retrieval has a long history of applying either discriminative or generative modeling to retrieval and ranking tasks.",
244
+ "Recent developments in transformer architectures and multi-task learning techniques have dramatically improved our ability to train effective neural models capable of resolving a wide variety of tasks using either of these paradigms.",
245
+ "In this paper, we propose a novel multi-task learning approach which can be used to produce more effective neural ranking models.",
246
+ "The key idea is to improve the quality of the underlying transformer model by cross-training a retrieval task and one or more complementary language generation tasks.",
247
+ "By targeting the training on the encoding layer in the transformer architecture, our experimental results show that the proposed multi-task learning approach consistently improves retrieval effectiveness on the targeted collection and can easily be re-targeted to new ranking tasks.",
248
+ "We provide an in-depth analysis showing how multi-task learning modifies model behaviors, resulting in more general models."
249
+ ]
250
+ }
251
+ ],
252
+ "user_kps": [
253
+ "argumentation mining",
254
+ "attention-based neural machine translation",
255
+ "cone beam computed tomography",
256
+ "conversational interactivity",
257
+ "conversational interfaces",
258
+ "dialogue systems",
259
+ "discriminative language modeling",
260
+ "exploratory search tasks",
261
+ "faceted search",
262
+ "learning concepts",
263
+ "neural ranking models",
264
+ "question answering",
265
+ "radiation dose",
266
+ "ranked retrieval",
267
+ "retrieval model",
268
+ "retrieval tasks",
269
+ "similarity-based retrieval",
270
+ "term networks",
271
+ "therapeutic targets",
272
+ "word retrieval"
273
+ ]
274
+ }
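
For downstream processing, the sentence-split abstracts usually need to be flattened back into a single text per paper. A hedged sketch of that consumption pattern, reusing the hypothetical load_seedset helper above (the app's actual featurization may differ):

```python
def paper_to_text(paper):
    # Rejoin the title and the sentence-split abstract into one string.
    return paper['title'] + ' ' + ' '.join(paper['abstract'])

seedset = load_seedset('data', 'hzamani')
texts = [paper_to_text(p) for p in seedset['papers']]
print(seedset['username'], '-', len(texts), 'seed papers,',
      len(seedset['user_kps']), 'profile keyphrases')
```
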
data/users/jbragg/seedset-jbragg-maple.json ADDED
@@ -0,0 +1,265 @@
1
+ {
2
+ "username": "jbragg",
3
+ "s2_authorid": "2699105",
4
+ "papers": [
5
+ {
6
+ "title": "The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces",
7
+ "abstract": [
8
+ "Scholarly publications are key to the transfer of knowledge from scholars to others.",
9
+ "However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows.",
10
+ "In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades.",
11
+ "The PDF format for sharing research papers is widely used due to its portability, but it has significant downsides including: static content, poor accessibility for low-vision readers, and difficulty reading on mobile devices.",
12
+ "This paper explores the question\"Can recent advances in AI and HCI power intelligent, interactive, and accessible reading interfaces -- even for legacy PDFs?\"We describe the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers.",
13
+ "Through this project, we've developed ten research prototype interfaces and conducted usability studies with more than 300 participants and real-world users showing improved reading experiences for scholars.",
14
+ "We've also released a production reading interface for research papers that will incorporate the best features as they mature.",
15
+ "We structure this paper around challenges scholars and the public face when reading research papers -- Discovery, Efficiency, Comprehension, Synthesis, and Accessibility -- and present an overview of our progress and remaining open challenges."
16
+ ]
17
+ },
18
+ {
19
+ "title": "ComLittee: Literature Discovery with Personal Elected Author Committees",
20
+ "abstract": [
21
+ "In order to help scholars understand and follow a research topic, significant research has been devoted to creating systems that help scholars discover relevant papers and authors.",
22
+ "Recent approaches have shown the usefulness of highlighting relevant authors while scholars engage in paper discovery.",
23
+ "However, these systems do not capture and utilize users' evolving knowledge of authors.",
24
+ "We reflect on the design space and introduce ComLittee, a literature discovery system that supports author-centric exploration.",
25
+ "In contrast to paper-centric interaction in prior systems, ComLittee's author-centric interaction supports curation of research threads from individual authors, finding new authors and papers with combined signals from a paper recommender and the curated authors' authorship graphs, and understanding them in the context of those signals.",
26
+ "In a within-subjects experiment that compares to an author-highlighting approach, we demonstrate how ComLittee leads to a higher efficiency, quality, and novelty in author discovery that also improves paper discovery."
27
+ ]
28
+ },
29
+ {
30
+ "title": "CiteSee: Augmenting Citations in Scientific Papers with Persistent and Personalized Historical Context",
31
+ "abstract": [
32
+ "When reading a scholarly article, inline citations help researchers contextualize the current article and discover relevant prior work.",
33
+ "However, it can be challenging to prioritize and make sense of the hundreds of citations encountered during literature reviews.",
34
+ "This paper introduces CiteSee, a paper reading tool that leverages a user's publishing, reading, and saving activities to provide personalized visual augmentations and context around citations.",
35
+ "First, CiteSee connects the current paper to familiar contexts by surfacing known citations a user had cited or opened.",
36
+ "Second, CiteSee helps users prioritize their exploration by highlighting relevant but unknown citations based on saving and reading history.",
37
+ "We conducted a lab study that suggests CiteSee is significantly more effective for paper discovery than three baselines.",
38
+ "A field deployment study shows CiteSee helps participants keep track of their explorations and leads to better situational awareness and increased paper discovery via inline citation when conducting real-world literature reviews."
39
+ ]
40
+ },
41
+ {
42
+ "title": "Relatedly: Scaffolding Literature Reviews with Existing Related Work Sections",
43
+ "abstract": [
44
+ "Scholars who want to research a scientific topic must take time to read, extract meaning, and identify connections across many papers.",
45
+ "As scientific literature grows, this becomes increasingly challenging.",
46
+ "Meanwhile, authors summarize prior research in papers' related work sections, though this is scoped to support a single paper.",
47
+ "A formative study found that while reading multiple related work paragraphs helps overview a topic, it is hard to navigate overlapping and diverging references and research foci.",
48
+ "In this work, we design a system, Relatedly, that scaffolds exploring and reading multiple related work paragraphs on a topic, with features including dynamic re-ranking and highlighting to spotlight unexplored dissimilar information, auto-generated descriptive paragraph headings, and low-lighting of redundant information.",
49
+ "From a within-subjects user study (n=15), we found that scholars generate more coherent, insightful, and comprehensive topic outlines using Relatedly compared to a baseline paper list."
50
+ ]
51
+ },
52
+ {
53
+ "title": "The Semantic Scholar Open Data Platform",
54
+ "abstract": [
55
+ "The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field.",
56
+ "Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.",
57
+ "We combine public and proprietary data sources using state-of-theart techniques for scholarly PDF content extraction and automatic knowledge graph construction to build the Semantic Scholar Academic Graph, the largest open scientific literature graph to-date, with 200M+ papers, 80M+ authors, 550M+ paper-authorship edges, and 2.4B+ citation edges.",
58
+ "The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.",
59
+ "In this paper, we describe the components of the S2 data processing pipeline and the associated APIs offered by the platform.",
60
+ "We will update this living document to reflect changes as we add new data offerings and improve existing services."
61
+ ]
62
+ },
63
+ {
64
+ "title": "Beyond Summarization: Designing AI Support for Real-World Expository Writing Tasks",
65
+ "abstract": [
66
+ "Large language models have introduced exciting new opportunities and challenges in designing and developing new AI-assisted writing support tools.",
67
+ "Recent work has shown that leveraging this new technology can transform writing in many scenarios such as ideation during creative writing, editing support, and summarization.",
68
+ "However, AI-supported expository writing--including real-world tasks like scholars writing literature reviews or doctors writing progress notes--is relatively understudied.",
69
+ "In this position paper, we argue that developing AI supports for expository writing has unique and exciting research challenges and can lead to high real-world impacts.",
70
+ "We characterize expository writing as evidence-based and knowledge-generating: it contains summaries of external documents as well as new information or knowledge.",
71
+ "It can be seen as the product of authors' sensemaking process over a set of source documents, and the interplay between reading, reflection, and writing opens up new opportunities for designing AI support.",
72
+ "We sketch three components for AI support design and discuss considerations for future research."
73
+ ]
74
+ },
75
+ {
76
+ "title": "A Dataset of Alt Texts from HCI Publications: Analyses and Uses Towards Producing More Descriptive Alt Texts of Data Visualizations in Scientific Papers",
77
+ "abstract": [
78
+ "Figures in scientific publications contain important information and results, and alt text is needed for blind and low vision readers to engage with their content.",
79
+ "We conduct a study to characterize the semantic content of alt text in HCI publications based on a framework introduced by Lundgard and Satyanarayan [30].",
80
+ "Our study focuses on alt text for graphs, charts, and plots extracted from HCI and accessibility publications; we focus on these communities due to the lack of alt text in papers published outside of these disciplines.",
81
+ "We find that the capacity of author-written alt text to fulfill blind and low vision user needs is mixed; for example, only 50% of alt texts in our sample contain information about extrema or outliers, and only 31% contain information about major trends or comparisons conveyed by the graph.",
82
+ "We release our collected dataset of author-written alt text, and outline possible ways that it can be used to develop tools and models to assist future authors in writing better alt text.",
83
+ "Based on our findings, we also discuss recommendations that can be acted upon by publishers and authors to encourage inclusion of more types of semantic content in alt text."
84
+ ]
85
+ },
86
+ {
87
+ "title": "CiteRead: Integrating Localized Citation Contexts into Scientific Paper Reading",
88
+ "abstract": [
89
+ "When reading a scholarly paper, scientists oftentimes wish to understand how follow-on work has built on or engages with what they are reading.",
90
+ "While a paper itself can only discuss prior work, some scientific search engines can provide a list of all subsequent citing papers; unfortunately, they are undifferentiated and disconnected from the contents of the original reference paper.",
91
+ "In this work, we introduce a novel paper reading experience that integrates relevant information about follow-on work directly into a paper, allowing readers to learn about newer papers and see how a paper is discussed by its citing papers in the context of the reference paper.",
92
+ "We built a tool, called CiteRead, that implements the following three contributions: 1) automated techniques for selecting important citing papers, building on results from a formative study we conducted, 2) an automated process for localizing commentary provided by citing papers to a place in the reference paper, and 3) an interactive experience that allows readers to seamlessly alternate between the reference paper and information from citing papers (e.g., citation sentences), placed in the margins.",
93
+ "Based on a user study with 12 scientists, we found that in comparison to having just a list of citing papers and their citation sentences, the use of CiteRead while reading allows for better comprehension and retention of information about follow-on work."
94
+ ]
95
+ },
96
+ {
97
+ "title": "Paper to HTML",
98
+ "abstract": [
99
+ "Most scientific papers are distributed in PDF format, which is by default inaccessible to blind and low vision audiences and people who use assistive reading technology.",
100
+ "These access barriers hinder and may even deter members of these groups from pursuing careers or opportunities that necessitate the reading of technical documents.",
101
+ "In cases where no accessible versions of papers are made available by publishers or authors, the gold standard for PDF document accessibility is PDF remediation.",
102
+ "Remediation is the process by which a PDF is made accessible by fixing accessibility errors, for example, tagging headings, specifying reading order, adding alt text to images, and so on, such that a reader can navigate and engage with the resulting content using assistive reading technology such as screen readers."
103
+ ]
104
+ },
105
+ {
106
+ "title": "FeedLens: Polymorphic Lenses for Personalizing Exploratory Search over Knowledge Graphs",
107
+ "abstract": [
108
+ "The vast scale and open-ended nature of knowledge graphs (KGs) make exploratory search over them cognitively demanding for users.",
109
+ "We introduce a new technique, polymorphic lenses, that improves exploratory search over a KG by obtaining new leverage from the existing preference models that KG-based systems maintain for recommending content.",
110
+ "The approach is based on a simple but powerful observation: in a KG, preference models can be re-targeted to recommend not only entities of a single base entity type (e.g., papers in the scientific literature KG, products in an e-commerce KG), but also all other types (e.g., authors, conferences, institutions; sellers, buyers).",
111
+ "We implement our technique in a novel system, FeedLens, which is built over Semantic Scholar, a production system for navigating the scientific literature KG.",
112
+ "FeedLens reuses the existing preference models on Semantic Scholar\u2014people\u2019s curated research feeds\u2014as lenses for exploratory search.",
113
+ "Semantic Scholar users can curate multiple feeds/lenses for different topics of interest, e.g., one for human-centered AI and another for document embeddings.",
114
+ "Although these lenses are defined in terms of papers, FeedLens re-purposes them to also guide search over authors, institutions, venues, etc.",
115
+ "Our system design is based on feedback from intended users via two pilot surveys (n = 17 and n = 13, respectively).",
116
+ "We compare FeedLens and Semantic Scholar via a third (within-subjects) user study (n = 15) and find that FeedLens increases user engagement while reducing the cognitive effort required to complete a short literature review task.",
117
+ "Our qualitative results also highlight people\u2019s preference for this more effective exploratory search experience enabled by FeedLens."
118
+ ]
119
+ },
120
+ {
121
+ "title": "Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing",
122
+ "abstract": [
123
+ "When seeking information not covered in patient-friendly documents, healthcare consumers may turn to the research literature.",
124
+ "Reading medical papers, however, can be a challenging experience.",
125
+ "To improve access to medical papers, we explore four features enabled by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guides readers to answering passages, and plain language summaries of those passages.",
126
+ "We embody these features into a prototype system, Paper Plain.",
127
+ "We evaluate Paper Plain, finding that participants who used the prototype system had an easier time reading research papers without a loss in paper comprehension compared to those who used a typical PDF reader.",
128
+ "Altogether, the study results suggest that guiding readers to relevant passages and providing plain language summaries alongside the original paper content can make reading medical papers easier and give readers more confidence to approach these papers."
129
+ ]
130
+ },
131
+ {
132
+ "title": "Scim: Intelligent Skimming Support for Scientific Papers",
133
+ "abstract": [
134
+ "Scholars need to keep up with an exponentially increasing flood of scientific papers.",
135
+ "To aid this challenge, we introduce Scim, a novel intelligent interface that helps experienced researchers skim \u2013 or rapidly review \u2013 a paper to attain a cursory understanding of its contents.",
136
+ "Scim supports the skimming process by highlighting salient paper contents in order to direct a reader\u2019s attention.",
137
+ "The system\u2019s highlights are faceted by content type, evenly distributed across a paper, and have a density configurable by readers at both the global and local level.",
138
+ "We evaluate Scim with both an in-lab usability study and a longitudinal diary study, revealing how its highlights facilitate the more efficient construction of a conceptualization of a paper.",
139
+ "We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools."
140
+ ]
141
+ },
142
+ {
143
+ "title": "From Who You Know to What You Read: Augmenting Scientific Recommendations with Implicit Social Networks",
144
+ "abstract": [
145
+ "The ever-increasing pace of scientific publication necessitates methods for quickly identifying relevant papers.",
146
+ "While neural recommenders trained on user interests can help, they still result in long, monotonous lists of suggested papers.",
147
+ "To improve the discovery experience we introduce multiple new methods for augmenting recommendations with textual relevance messages that highlight knowledge-graph connections between recommended papers and a user\u2019s publication and interaction history.",
148
+ "We explore associations mediated by author entities and those using citations alone.",
149
+ "In a large-scale, real-world study, we show how our approach significantly increases engagement\u2014and future engagement when mediated by authors\u2014without introducing bias towards highly-cited authors.",
150
+ "To expand message coverage for users with less publication or interaction history, we develop a novel method that highlights connections with proxy authors of interest to users and evaluate it in a controlled lab study.",
151
+ "Finally, we synthesize design implications for future graph-based messages."
152
+ ]
153
+ },
154
+ {
155
+ "title": "Scim: Intelligent Faceted Highlights for Interactive, Multi-Pass Skimming of Scientific Papers",
156
+ "abstract": [
157
+ "Researchers are expected to keep up with an immense literature, yet often find it prohibitively time-consuming to do so.",
158
+ "This paper ex-plores how intelligent agents can help scaffold in-situ information seeking across scientific papers.",
159
+ "Specifically, we present Scim, an AI-augmented reading interface designed to help researchers skim papers by automatically identifying, classifying, and highlighting salient sentences, organized into rhetorical facets rooted in common information needs.",
160
+ "Using Scim as a design probe, we explore the benefits and drawbacks of imperfect AI assistance within an augmented reading interface.",
161
+ "We found researchers used Scim in several different ways: from reading primarily in the \u2018highlight browser\u2019 (side panel) to making multiple passes through the paper with different facets activated (e.g., focusing solely on objective and novelty in their first pass).",
162
+ "From our study, we identify six key design recommendations and avenues for future research in augmented reading interfaces."
163
+ ]
164
+ },
165
+ {
166
+ "title": "Exploring Team-Sourced Hyperlinks to Address Navigation Challenges for Low-Vision Readers of Scientific Papers",
167
+ "abstract": [
168
+ "Reading academic papers is a fundamental part of higher education and research, but navigating these information-dense texts can be challenging.",
169
+ "In particular, low-vision readers using magnification encounter additional barriers to quickly skimming and visually locating information.",
170
+ "In this work, we explored the design of interfaces to enable readers to: 1) navigate papers more easily, and 2) input the required navigation hooks that AI cannot currently automate.",
171
+ "To explore this design space, we ran two exploratory studies.",
172
+ "The first focused on current practices of low-vision paper readers, the challenges they encounter, and the interfaces they desire.",
173
+ "During this study, low-vision participants were interviewed, and tried out four new paper navigation prototypes.",
174
+ "Results from this study grounded the design of our end-to-end system prototype Ocean, which provides an accessible front-end for low-vision readers, and enables all readers to contribute to the backend by leaving traces of their reading paths for others to leverage.",
175
+ "Our second study used this exploratory interface in a field study with groups of low-vision and sighted readers to probe the user experience of reading and creating traces.",
176
+ "Our findings suggest that it may be possible for readers of all abilities to organically leave traces in papers, and that these traces can be used to facilitate navigation tasks, in particular for low-vision readers.",
177
+ "Based on our findings, we present design considerations for creating future paper-reading tools that improve access, and organically source the required data from readers."
178
+ ]
179
+ },
180
+ {
181
+ "title": "GENIE: Toward Reproducible and Standardized Human Evaluation for Text Generation",
182
+ "abstract": [
183
+ "While often assumed a gold standard, effective human evaluation of text generation remains an important, open area for research.",
184
+ "We revisit this problem with a focus on producing consistent evaluations that are reproducible\u2014over time and across different populations.",
185
+ "We study this goal in different stages of the human evaluation pipeline.",
186
+ "In particular, we consider design choices for the annotation interface used to elicit human judgments and their impact on reproducibility.",
187
+ "Furthermore, we develop an automated mechanism for maintaining annotator quality via a probabilistic model that detects and excludes noisy annotators.",
188
+ "Putting these lessons together, we introduce GENIE: a system for running standardized human evaluations across different generation tasks.",
189
+ "We instantiate GENIE with datasets representing four core challenges in text generation: machine translation, summarization, commonsense reasoning, and machine comprehension.",
190
+ "For each task, GENIE offers a leaderboard that automatically crowdsources annotations for submissions, evaluating them along axes such as correctness, conciseness, and fluency.",
191
+ "We have made the GENIE leaderboards publicly available, and have already ranked 50 submissions from 10 different research groups.",
192
+ "We hope GENIE encourages further progress toward effective, standardized evaluations for text generation."
193
+ ]
194
+ },
195
+ {
196
+ "title": "FLEX: Unifying Evaluation for Few-Shot NLP",
197
+ "abstract": [
198
+ "Few-shot NLP research is highly active, yet conducted in disjoint research threads with evaluation suites that lack challenging-yet-realistic testing setups and fail to employ careful experimental design.",
199
+ "Consequently, the community does not know which techniques perform best or even if they outperform simple baselines.",
200
+ "In response, we formulate the FLEX Principles, a set of requirements and best practices for unified, rigorous, valid, and cost-sensitive few-shot NLP evaluation.",
201
+ "These principles include Sample Size Design, a novel approach to benchmark design that optimizes statistical accuracy and precision while keeping evaluation costs manageable.",
202
+ "Following the principles, we release the FLEX benchmark, which includes four few-shot transfer settings, zero-shot evaluation, and a public leaderboard that covers diverse NLP tasks.",
203
+ "In addition, we present UniFew, a prompt-based model for few-shot learning that unifies pretraining and finetuning prompt formats, eschewing complex machinery of recent prompt-based approaches in adapting downstream task formats to language model pretraining objectives.",
204
+ "We demonstrate that despite simplicity, UniFew achieves results competitive with both popular meta-learning and prompt-based approaches."
205
+ ]
206
+ },
207
+ {
208
+ "title": "Improving the Accessibility of Scientific Documents: Current State, User Needs, and a System Solution to Enhance Scientific PDF Accessibility for Blind and Low Vision Users",
209
+ "abstract": [
210
+ "The majority of scientific papers are distributed in PDF, which pose challenges for accessibility, especially for blind and low vision (BLV) readers.",
211
+ "We characterize the scope of this problem by assessing the accessibility of 11,397 PDFs published 2010--2019 sampled across various fields of study, finding that only 2.4% of these PDFs satisfy all of our defined accessibility criteria.",
212
+ "We introduce the SciA11y system to offset some of the issues around inaccessibility.",
213
+ "SciA11y incorporates several machine learning models to extract the content of scientific PDFs and render this content as accessible HTML, with added novel navigational features to support screen reader users.",
214
+ "An intrinsic evaluation of extraction quality indicates that the majority of HTML renders (87%) produced by our system have no or only some readability issues.",
215
+ "We perform a qualitative user study to understand the needs of BLV researchers when reading papers, and to assess whether the SciA11y system could address these needs.",
216
+ "We summarize our user study findings into a set of five design recommendations for accessible scientific reader systems.",
217
+ "User response to SciA11y was positive, with all users saying they would be likely to use the system in the future, and some stating that the system, if available, would become their primary workflow.",
218
+ "We successfully produce HTML renders for over 12M papers, of which an open access subset of 1.5M are available for browsing at https://scia11y.org/."
219
+ ]
220
+ },
221
+ {
222
+ "title": "SciA11y: Converting Scientific Papers to Accessible HTML",
223
+ "abstract": [
224
+ "We present SciA11y, a system that renders inaccessible scientific paper PDFs into HTML.",
225
+ "SciA11y uses machine learning models to extract and understand the content of scientific PDFs, and reorganizes the resulting paper components into a form that better supports skimming and scanning for blind and low vision (BLV) readers.",
226
+ "SciA11y adds navigation features such as tagged headings, a table of contents, and bidirectional links between inline citations and references, which allow readers to resolve citations without losing their context.",
227
+ "A set of 1.5 million open access papers are processed and available at https://scia11y.org/. This system is a first step in addressing scientific PDF accessibility, and may significantly improve the experience of paper reading for BLV users."
228
+ ]
229
+ },
230
+ {
231
+ "title": "GENIE: A Leaderboard for Human-in-the-Loop Evaluation of Text Generation",
232
+ "abstract": [
233
+ "Leaderboards have eased model development for many NLP datasets by standardizing their evaluation and delegating it to an independent external repository.",
234
+ "Their adoption, however, is so far limited to tasks which can be reli-ably evaluated in an automatic manner.",
235
+ "This work introduces G ENIE , an extensible human evaluation leaderboard, which brings the ease of leaderboards to text generation tasks.",
236
+ "G E - NIE automatically posts leaderboard submissions to crowdsourcing platforms asking human annotators to evaluate them on various axes (e.g., correctness, conciseness, \ufb02uency), and compares their answers to various automatic metrics.",
237
+ "We introduce several datasets in English to G ENIE , representing four core challenges in text generation: machine translation, summarization, commonsense reasoning, and machine comprehension.",
238
+ "We provide formal granular evaluation metrics and identify areas for future research.",
239
+ "We make G ENIE publicly available, 1 and hope that it will spur progress in language generation models as well as their automatic and manual evaluation."
240
+ ]
241
+ }
242
+ ],
243
+ "user_kps": [
244
+ "accessible user interfaces",
245
+ "attentive user interfaces",
246
+ "biomedical literature mining",
247
+ "braille documents",
248
+ "citation context analysis",
249
+ "citation network",
250
+ "collaborative writing",
251
+ "document representations",
252
+ "exploratory searches",
253
+ "few-shot learning",
254
+ "hl7 's clinical document architecture",
255
+ "human annotations",
256
+ "human readability",
257
+ "human readers",
258
+ "information accessibility",
259
+ "literature-based discovery",
260
+ "natural language generation",
261
+ "scholarly communication",
262
+ "text comprehension",
263
+ "textual interface"
264
+ ]
265
+ }
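
Note: each seedset file added in this commit follows the schema above: a "username", a Semantic Scholar author id ("s2_authorid"), a "papers" list whose abstracts are stored as lists of sentence strings, and a "user_kps" keyphrase list. As a minimal sketch of how such a file can be read back (load_seedset is a hypothetical helper, and the data/users layout mirrors the paths added here):

    import json
    import os

    DATA_ROOT = "data/users"  # layout used by the seedset files in this commit

    def load_seedset(username):
        # Hypothetical helper: load one user's seedset profile by username.
        path = os.path.join(DATA_ROOT, username, f"seedset-{username}-maple.json")
        with open(path) as f:
            profile = json.load(f)
        # Abstracts are stored sentence-by-sentence; join them for display.
        for paper in profile["papers"]:
            paper["abstract_text"] = " ".join(paper["abstract"])
        return profile

    profile = load_seedset("agloberson")
    print(profile["s2_authorid"], len(profile["papers"]), len(profile["user_kps"]))
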
data/users/lsoldaini/seedset-lsoldaini-maple.json ADDED
@@ -0,0 +1,259 @@
1
+ {
2
+ "username": "lsoldaini",
3
+ "s2_authorid": "3328733",
4
+ "papers": [
5
+ {
6
+ "title": "One-Shot Labeling for Automatic Relevance Estimation",
7
+ "abstract": [
8
+ "Dealing with unjudged documents (\"holes\") in relevance assessments is a perennial problem when evaluating search systems with offline experiments.",
9
+ "Holes can reduce the apparent effectiveness of retrieval systems during evaluation and introduce biases in models trained with incomplete data.",
10
+ "In this work, we explore whether large language models can help us fill such holes to improve offline evaluations.",
11
+ "We examine an extreme, albeit common, evaluation setting wherein only a single known relevant document per query is available for evaluation.",
12
+ "We then explore various approaches for predicting the relevance of unjudged documents with respect to a query and the known relevant document, including nearest neighbor, supervised, and prompting techniques.",
13
+ "We find that although the predictions of these One-Shot Labelers (1SLs) frequently disagree with human assessments, the labels they produce yield a far more reliable ranking of systems than the single labels do alone.",
14
+ "Specifically, the strongest approaches can consistently reach system ranking correlations of over 0.85 with the full rankings over a variety of measures.",
15
+ "Meanwhile, the approach substantially reduces the false positive rate of t-tests due to holes in relevance assessments (from 15-30% down to under 5%), giving researchers more confidence in results they find to be significant."
16
+ ]
17
+ },
18
+ {
19
+ "title": "The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces",
20
+ "abstract": [
21
+ "Scholarly publications are key to the transfer of knowledge from scholars to others.",
22
+ "However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows.",
23
+ "In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades.",
24
+ "The PDF format for sharing research papers is widely used due to its portability, but it has significant downsides including: static content, poor accessibility for low-vision readers, and difficulty reading on mobile devices.",
25
+ "This paper explores the question\"Can recent advances in AI and HCI power intelligent, interactive, and accessible reading interfaces -- even for legacy PDFs?\"We describe the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers.",
26
+ "Through this project, we've developed ten research prototype interfaces and conducted usability studies with more than 300 participants and real-world users showing improved reading experiences for scholars.",
27
+ "We've also released a production reading interface for research papers that will incorporate the best features as they mature.",
28
+ "We structure this paper around challenges scholars and the public face when reading research papers -- Discovery, Efficiency, Comprehension, Synthesis, and Accessibility -- and present an overview of our progress and remaining open challenges."
29
+ ]
30
+ },
31
+ {
32
+ "title": "The Semantic Scholar Open Data Platform",
33
+ "abstract": [
34
+ "The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field.",
35
+ "Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature.",
36
+ "We combine public and proprietary data sources using state-of-theart techniques for scholarly PDF content extraction and automatic knowledge graph construction to build the Semantic Scholar Academic Graph, the largest open scientific literature graph to-date, with 200M+ papers, 80M+ authors, 550M+ paper-authorship edges, and 2.4B+ citation edges.",
37
+ "The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings.",
38
+ "In this paper, we describe the components of the S2 data processing pipeline and the associated APIs offered by the platform.",
39
+ "We will update this living document to reflect changes as we add new data offerings and improve existing services."
40
+ ]
41
+ },
42
+ {
43
+ "title": "Knowledge Transfer from Answer Ranking to Answer Generation",
44
+ "abstract": [
45
+ "Recent studies show that Question Answering (QA) based on Answer Sentence Selection (AS2) can be improved by generating an improved answer from the top-k ranked answer sentences (termed GenQA).",
46
+ "This allows for synthesizing the information from multiple candidates into a concise, natural-sounding answer.",
47
+ "However, creating large-scale supervised training data for GenQA models is very challenging.",
48
+ "In this paper, we propose to train a GenQA model by transferring knowledge from a trained AS2 model, to overcome the aforementioned issue.",
49
+ "First, we use an AS2 model to produce a ranking over answer candidates for a set of questions.",
50
+ "Then, we use the top ranked candidate as the generation target, and the next k top ranked candidates as context for training a GenQA model.",
51
+ "We also propose to use the AS2 model prediction scores for loss weighting and score-conditioned input/output shaping, to aid the knowledge transfer.",
52
+ "Our evaluation on three public and one large industrial datasets demonstrates the superiority of our approach over the AS2 baseline, and GenQA trained using supervised data."
53
+ ]
54
+ },
55
+ {
56
+ "title": "Exploring the Challenges of Open Domain Multi-Document Summarization",
57
+ "abstract": [
58
+ "Multi-document summarization (MDS) has traditionally been studied assuming a set of ground-truth topic-related input documents is provided.",
59
+ "In practice, the input document set is unlikely to be available a priori and would need to be retrieved based on an information need, a setting we call open-domain MDS.",
60
+ "We experiment with current state-of-the-art retrieval and summarization models on several popular MDS datasets extended to the open-domain setting.",
61
+ "We find that existing summarizers suffer large reductions in performance when applied as-is to this more realistic task, though training summarizers with retrieved inputs can reduce their sensitivity retrieval errors.",
62
+ "To further probe these findings, we conduct perturbation experiments on summarizer inputs to study the impact of different types of document retrieval errors.",
63
+ "Based on our results, we provide practical guidelines to help facilitate a shift to open-domain MDS.",
64
+ "We release our code and experimental results alongside all data or model artifacts created during our investigation."
65
+ ]
66
+ },
67
+ {
68
+ "title": "Paragraph-based Transformer Pre-training for Multi-Sentence Inference",
69
+ "abstract": [
70
+ "Inference tasks such as answer sentence selection (AS2) or fact verification are typically solved by fine-tuning transformer-based models as individual sentence-pair classifiers.",
71
+ "Recent studies show that these tasks benefit from modeling dependencies across multiple candidate sentences jointly.",
72
+ "In this paper, we first show that popular pre-trained transformers perform poorly when used for fine-tuning on multi-candidate inference tasks.",
73
+ "We then propose a new pre-training objective that models the paragraph-level semantics across multiple input sentences.",
74
+ "Our evaluation on three AS2 and one fact verification datasets demonstrates the superiority of our pre-training technique over the traditional ones for transformers used as joint models for multi-candidate inference tasks, as well as when used as cross-encoders for sentence-pair formulations of these tasks."
75
+ ]
76
+ },
77
+ {
78
+ "title": "Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection",
79
+ "abstract": [
80
+ "An important task for designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents.",
81
+ "In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transformers for AS2, and mitigate the requirement of large labeled datasets.",
82
+ "Specifically, the model is tasked to predict whether: (i) two sentences are extracted from the same paragraph, (ii) a given sentence is extracted from a given paragraph, and (iii) two paragraphs are extracted from the same document.",
83
+ "Our experiments on three public and one industrial AS2 datasets demonstrate the empirical superiority of our pre-trained transformers over baseline models such as RoBERTa and ELECTRA for AS2."
84
+ ]
85
+ },
86
+ {
87
+ "title": "Scim: Intelligent Skimming Support for Scientific Papers",
88
+ "abstract": [
89
+ "Scholars need to keep up with an exponentially increasing flood of scientific papers.",
90
+ "To aid this challenge, we introduce Scim, a novel intelligent interface that helps experienced researchers skim \u2013 or rapidly review \u2013 a paper to attain a cursory understanding of its contents.",
91
+ "Scim supports the skimming process by highlighting salient paper contents in order to direct a reader\u2019s attention.",
92
+ "The system\u2019s highlights are faceted by content type, evenly distributed across a paper, and have a density configurable by readers at both the global and local level.",
93
+ "We evaluate Scim with both an in-lab usability study and a longitudinal diary study, revealing how its highlights facilitate the more efficient construction of a conceptualization of a paper.",
94
+ "We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools."
95
+ ]
96
+ },
97
+ {
98
+ "title": "Cross-Lingual G EN QA: Open-Domain Question Answering with Answer Sentence Generation",
99
+ "abstract": [
100
+ "Recent approaches for question answering systems have achieved impressive performance on English by combining document-level retrieval with answer generation.",
101
+ "These approaches, which we refer to as G EN QA, are able to generate full sentences, effectively answering both factoid and non-factoid questions.",
102
+ "In this paper, we extend G EN QA beyond English and present the \ufb01rst Cross-Lingual answer sentence generation system (C ROSS -L INGUAL G EN QA).",
103
+ "Our system pro-duces natural, full-sentence answers to questions in several languages by exploiting passages written in multiple other languages.",
104
+ "To foster further development on this topic, we introduce G EN -T Y D I QA, an extension of the TyDiQA dataset with well-formed and complete answers for Arabic, Bengali, English, Japanese, and Russian questions.",
105
+ "Using G EN -T Y D I QA, we show that multi-language models outperform monolingual G EN QA in the four non-English languages; for three of them, our C ROSS -L INGUAL G EN QA system achieves the best results."
106
+ ]
107
+ },
108
+ {
109
+ "title": "Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems",
110
+ "abstract": [
111
+ "Large transformer models can highly improve Answer Sentence Selection (AS2) tasks, but their high computational costs prevent their use in many real-world applications.",
112
+ "In this paper, we explore the following research question: How can we make the AS2 models more accurate without significantly increasing their model complexity?",
113
+ "To address the question, we propose a Multiple Heads Student architecture (named CERBERUS), an efficient neural network designed to distill an ensemble of large transformers into a single smaller model.",
114
+ "CERBERUS consists of two components: a stack of transformer layers that is used to encode inputs, and a set of ranking heads; unlike traditional distillation technique, each of them is trained by distilling a different large transformer architecture in a way that preserves the diversity of the ensemble members.",
115
+ "The resulting model captures the knowledge of heterogeneous transformer models by using just a few extra parameters.",
116
+ "We show the effectiveness of CERBERUS on three English datasets for AS2; our proposed approach outperforms all single-model distillations we consider, rivaling the state-of-the-art large AS2 models that have 2.7x more parameters and run 2.5x slower.",
117
+ "Code for our model is available at https://github.com/amazon-research/wqa-cerberus"
118
+ ]
119
+ },
120
+ {
121
+ "title": "Embedding Recycling for Language Models",
122
+ "abstract": [
123
+ "Real-world applications of neural language models often involve running many different models over the same corpus.",
124
+ "The high computational cost of these runs has led to interest in techniques that can reuse the contextualized embeddings produced in previous runs to speed training and inference of future ones.",
125
+ "We refer to this approach as embedding recycling (ER).",
126
+ "While multiple ER techniques have been proposed, their practical effectiveness is still unknown because existing evaluations consider very few models and do not adequately account for overhead costs.",
127
+ "We perform an extensive evaluation of ER across eight different models (17 to 900 million parameters) and fourteen tasks in English.",
128
+ "We show how a simple ER technique that caches activations from an intermediate layer of a pretrained model, and learns task-specific adapters on the later layers, is broadly effective.",
129
+ "For the best-performing baseline in our experiments (DeBERTa-v2 XL), adding a precomputed cache results in a>90% speedup during training and 87-91% speedup for inference, with negligible impact on accuracy.",
130
+ "Our analysis reveals important areas of future work."
131
+ ]
132
+ },
133
+ {
134
+ "title": "Cross-Lingual Open-Domain Question Answering with Answer Sentence Generation",
135
+ "abstract": [
136
+ "Open-Domain Generative Question Answering has achieved impressive performance in English by combining document-level retrieval with answer generation.",
137
+ "These approaches, which we refer to as GenQA, can generate complete sentences, effectively answering both factoid and non-factoid questions.",
138
+ "In this paper, we extend to the multilingual and cross-lingual settings.",
139
+ "For this purpose, we first introduce GenTyDiQA, an extension of the TyDiQA dataset with well-formed and complete answers for Arabic, Bengali, English, Japanese, and Russian.",
140
+ "Based on GenTyDiQA, we design a cross-lingual generative model that produces full-sentence answers by exploiting passages written in multiple languages, including languages different from the question.",
141
+ "Our cross-lingual generative system outperforms answer sentence selection baselines for all 5 languages and monolingual generative pipelines for three out of five languages studied."
142
+ ]
143
+ },
144
+ {
145
+ "title": "Modeling Context in Answer Sentence Selection Systems on a Latency Budget",
146
+ "abstract": [
147
+ "Answer Sentence Selection (AS2) is an efficient approach for the design of open-domain Question Answering (QA) systems.",
148
+ "In order to achieve low latency, traditional AS2 models score question-answer pairs individually, ignoring any information from the document each potential answer was extracted from.",
149
+ "In contrast, more computationally expensive models designed for machine reading comprehension tasks typically receive one or more passages as input, which often results in better accuracy.",
150
+ "In this work, we present an approach to efficiently incorporate contextual information in AS2 models.",
151
+ "For each answer candidate, we first use unsupervised similarity techniques to extract relevant sentences from its source document, which we then feed into an efficient transformer architecture fine-tuned for AS2.",
152
+ "Our best approach, which leverages a multi-way attention architecture to efficiently encode context, improves 6% to 11% over non-contextual state of the art in AS2 with minimal impact on system latency.",
153
+ "All experiments in this work were conducted in English."
154
+ ]
155
+ },
156
+ {
157
+ "title": "Cross-Lingual GenQA: A Language-Agnostic Generative Question Answering Approach for Open-Domain Question Answering",
158
+ "abstract": [
159
+ "Open-Retrieval Generative Question Answering (G EN QA) is proven to deliver high-quality, natural-sounding answers in English.",
160
+ "In this paper, we present the \ufb01rst generalization of the G EN QA approach for the multilingual environment.",
161
+ "To this end, we present the G EN -T Y D I QA dataset, which extends the TyDiQA evaluation data (Clark et al., 2020) with natural-sounding, well-formed answers in Arabic, Bengali, English, Japanese, and Russian.",
162
+ "For all these languages, we show that a G EN QA sequence-to-sequence-based model outperforms a state-of-the-art Answer Sentence Selection model.",
163
+ "We also show that a multilingually-trained model competes with, and in some cases outperforms, its monolingual counterparts.",
164
+ "Finally, we show that our system can even compete with strong baselines, even when fed with information from a variety of languages.",
165
+ "Essentially, our system is able to answer a question in any language of our language set using information from many languages, making it the \ufb01rst Language-Agnostic G EN QA system."
166
+ ]
167
+ },
168
+ {
169
+ "title": "Answer Generation for Retrieval-based Question Answering Systems",
170
+ "abstract": [
171
+ "Recent advancements in transformer-based models have greatly improved the ability of Question Answering (QA) systems to provide correct answers; in particular, answer sentence selection (AS2) models, core components of retrieval-based systems, have achieved impressive results.",
172
+ "While generally effective, these models fail to provide a satisfying answer when all retrieved candidates are of poor quality, even if they contain correct information.",
173
+ "In AS2, models are trained to select the best answer sentence among a set of candidates retrieved for a given question.",
174
+ "In this work, we propose to generate answers from a set of AS2 top candidates.",
175
+ "Rather than selecting the best candidate, we train a sequence to sequence transformer model to generate an answer from a candidate set.",
176
+ "Our tests on three English AS2 datasets show improvement up to 32 absolute points in accuracy over the state of the art."
177
+ ]
178
+ },
179
+ {
180
+ "title": "Don\u2019t Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing",
181
+ "abstract": [
182
+ "Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by its users.",
183
+ "Traditionally, rule-based or statistical slot-filling systems have been used to parse \u201csimple\u201d queries; that is, queries that contain a single action and can be decomposed into a set of non-overlapping entities.",
184
+ "More recently, shift-reduce parsers have been proposed to process more complex utterances.",
185
+ "These methods, while powerful, impose specific limitations on the type of queries that can be parsed; namely, they require a query to be representable as a parse tree.",
186
+ "In this work, we propose a unified architecture based on Sequence to Sequence models and Pointer Generator Network to handle both simple and complex queries.",
187
+ "Unlike other works, our approach does not impose any restriction on the semantic parse schema.",
188
+ "Furthermore, experiments show that it achieves state of the art performance on three publicly available datasets (ATIS, SNIPS, Facebook TOP), relatively improving between 3.3% and 7.7% in exact match accuracy over previous systems.",
189
+ "Finally, we show the effectiveness of our approach on two internal datasets."
190
+ ]
191
+ },
192
+ {
193
+ "title": "The Cascade Transformer: an Application for Efficient Answer Sentence Selection",
194
+ "abstract": [
195
+ "Large transformer-based language models have been shown to be very effective in many classification tasks.",
196
+ "However, their computational complexity prevents their use in applications requiring the classification of a large set of candidates.",
197
+ "While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve batch throughput during inference.",
198
+ "In this paper, we introduce the Cascade Transformer, a simple yet effective technique to adapt transformer-based models into a cascade of rankers.",
199
+ "Each ranker is used to prune a subset of candidates in a batch, thus dramatically increasing throughput at inference time.",
200
+ "Partial encodings from the transformer model are shared among rerankers, providing further speed-up.",
201
+ "When compared to a state-of-the-art transformer model, our approach reduces computation by 37% with almost no impact on accuracy, as measured on two English Question Answering datasets."
202
+ ]
203
+ },
204
+ {
205
+ "title": "Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention",
206
+ "abstract": [
207
+ "Currently, in spoken language understanding (SLU) systems, the automatic speech recognition (ASR) module produces multiple interpretations (or hypotheses) for the input audio signal and the natural language understanding (NLU) module takes the one with the highest confidence score for domain or intent classification.",
208
+ "However, the interpretations can be noisy, and solely relying on one interpretation can cause information loss.",
209
+ "To address the problem, many research works attempt to rerank the interpretations for a better choice while some recent works get better performance by integrating all the hypotheses during prediction.",
210
+ "In this paper, we follow the way of integrating hypotheses but strengthen the training mode by involving more tasks, some of which may be not in existing tasks of NLU but relevant, via multi-task learning or transfer learning.",
211
+ "Moreover, we propose the Hierarchical Attention Mechanism (HAM) to further improve the performance with the acoustic-model features like confidence scores, which are ignored in the current hypotheses integration models.",
212
+ "The experimental results show that compared to the standard estimation with one hypothesis, the multi-task learning with HAM can improve the domain and intent classification by relatively 19% and 37%, which are much higher than improvements with current integration or reranking methods.",
213
+ "To illustrate the cause of improvements brought by our model, we decode the hidden representations of some utterance examples and compare the generated texts with hypotheses and transcripts.",
214
+ "The comparison shows that our model could recover the transcription by integrating the fragmented information among hypotheses and identifying the frequent error patterns of the ASR module, and even rewrite the query for a better understanding, which reveals the characteristic of multi-task learning of broadcasting knowledge."
215
+ ]
216
+ },
217
+ {
218
+ "title": "Improving Spoken Language Understanding By Exploiting ASR N-best Hypotheses",
219
+ "abstract": [
220
+ "In a modern spoken language understanding (SLU) system, the natural language understanding (NLU) module takes interpretations of a speech from the automatic speech recognition (ASR) module as the input.",
221
+ "The NLU module usually uses the first best interpretation of a given speech in downstream tasks such as domain and intent classification.",
222
+ "However, the ASR module might misrecognize some speeches and the first best interpretation could be erroneous and noisy.",
223
+ "Solely relying on the first best interpretation could make the performance of downstream tasks non-optimal.",
224
+ "To address this issue, we introduce a series of simple yet efficient models for improving the understanding of semantics of the input speeches by collectively exploiting the n-best speech interpretations from the ASR module."
225
+ ]
226
+ },
227
+ {
228
+ "title": "The Knowledge and Language Gap in Medical Information Seeking",
229
+ "abstract": [
230
+ "Interest in medical information retrieval has risen significantly in the last few years.",
231
+ "The Internet has become a primary source for consumers looking for health information and advice; however, their lack of expertise causes a language and knowledge gap that affects their ability to properly formulate their information needs.",
232
+ "Health experts also struggle to efficiently search the large amount of medical literature available to them, which impacts their ability of integrating the latest research findings in clinical practice.",
233
+ "In this dissertation, I propose several methods to overcome these challenges, thus improving search outcomes."
234
+ ]
235
+ }
236
+ ],
237
+ "user_kps": [
238
+ "attentive user interfaces",
239
+ "bibliometric data",
240
+ "citation network",
241
+ "generating sentences",
242
+ "google scholar",
243
+ "human readers",
244
+ "implicit relevance feedback",
245
+ "information retrieval evaluation",
246
+ "information retrieval models",
247
+ "mobile readers",
248
+ "multi-document summarization",
249
+ "neural ranking models",
250
+ "neural semantic parser",
251
+ "question answering task",
252
+ "question generation",
253
+ "retrieval relevance",
254
+ "scholarly communication",
255
+ "sentence encoders",
256
+ "sentence selection",
257
+ "speech understanding"
258
+ ]
259
+ }
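
The same schema holds for every user directory below. A light consistency check over all seedset files could look like the following sketch (illustrative only; the glob pattern mirrors the paths added in this commit):

    import glob
    import json

    REQUIRED_KEYS = {"username", "s2_authorid", "papers", "user_kps"}

    for path in sorted(glob.glob("data/users/*/seedset-*-maple.json")):
        with open(path) as f:
            profile = json.load(f)
        missing = REQUIRED_KEYS - profile.keys()
        assert not missing, f"{path}: missing keys {missing}"
        for paper in profile["papers"]:
            # Titles are plain strings; abstracts are lists of sentence strings.
            assert isinstance(paper["title"], str)
            assert all(isinstance(s, str) for s in paper["abstract"])
        print(path, "-", len(profile["papers"]), "papers,", len(profile["user_kps"]), "keyphrases")
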
data/users/mccallum/seedset-mccallum-maple.json ADDED
@@ -0,0 +1,264 @@
1
+ {
2
+ "username": "mccallum",
3
+ "s2_authorid": "143753639",
4
+ "papers": [
5
+ {
6
+ "title": "Large Language Model Augmented Narrative Driven Recommendations",
7
+ "abstract": [
8
+ "Narrative-driven recommendation (NDR) presents an information access problem where users solicit recommendations with verbose descriptions of their preferences and context, for example, travelers soliciting recommendations for points of interest while describing their likes/dislikes and travel circumstances.",
9
+ "These requests are increasingly important with the rise of natural language-based conversational interfaces for search and recommendation systems.",
10
+ "However, NDR lacks abundant training data for models, and current platforms commonly do not support these requests.",
11
+ "Fortunately, classical user-item interaction datasets contain rich textual data, e.g., reviews, which often describe user preferences and context \u2013 this may be used to bootstrap training for NDR models.",
12
+ "In this work, we explore using large language models (LLMs) for data augmentation to train NDR models.",
13
+ "We use LLMs for authoring synthetic narrative queries from user-item interactions with few-shot prompting and train retrieval models for NDR on synthetic queries and user-item interaction data.",
14
+ "Our experiments demonstrate that this is an effective strategy for training small-parameter retrieval models that outperform other retrieval and LLM baselines for narrative-driven recommendation."
15
+ ]
16
+ },
17
+ {
18
+ "title": "Answering Compositional Queries with Set-Theoretic Embeddings",
19
+ "abstract": [
20
+ "The need to compactly and robustly represent item-attribute relations arises in many important tasks, such as faceted browsing and recommendation systems.",
21
+ "A popular machine learning approach for this task denotes that an item has an attribute by a high dot-product between vectors for the item and attribute -- a representation that is not only dense, but also tends to correct noisy and incomplete data.",
22
+ "While this method works well for queries retrieving items by a single attribute (such as \\emph{movies that are comedies}), we find that vector embeddings do not so accurately support compositional queries (such as movies that are comedies and British but not romances).",
23
+ "To address these set-theoretic compositions, this paper proposes to replace vectors with box embeddings, a region-based representation that can be thought of as learnable Venn diagrams.",
24
+ "We introduce a new benchmark dataset for compositional queries, and present experiments and analysis providing insights into the behavior of both.",
25
+ "We find that, while vector and box embeddings are equally suited to single attribute queries, for compositional queries box embeddings provide substantial advantages over vectors, particularly at the moderate and larger retrieval set sizes that are most useful for users' search and browsing."
26
+ ]
27
+ },
28
+ {
29
+ "title": "Revisiting the Architectures like Pointer Networks to Efficiently Improve the Next Word Distribution, Summarization Factuality, and Beyond",
30
+ "abstract": [
31
+ "Is the output softmax layer, which is adopted by most language models (LMs), always the best way to compute the next word probability?",
32
+ "Given so many attention layers in a modern transformer-based LM, are the pointer networks redundant nowadays?",
33
+ "In this study, we discover that the answers to both questions are no. This is because the softmax bottleneck sometimes prevents the LMs from predicting the desired distribution and the pointer networks can be used to break the bottleneck efficiently.",
34
+ "Based on the finding, we propose several softmax alternatives by simplifying the pointer networks and accelerating the word-by-word rerankers.",
35
+ "In GPT-2, our proposals are significantly better and more efficient than mixture of softmax, a state-of-the-art softmax alternative.",
36
+ "In summarization experiments, without significantly decreasing its training/testing speed, our best method based on T5-Small improves factCC score by 2 points in CNN/DM and XSUM dataset, and improves MAUVE scores by 30% in BookSum paragraph-level dataset."
37
+ ]
38
+ },
39
+ {
40
+ "title": "Editable User Profiles for Controllable Text Recommendations",
41
+ "abstract": [
42
+ "Methods for making high-quality recommendations often rely on learning latent representations from interaction data.",
43
+ "These methods, while performant, do not provide ready mechanisms for users to control the recommendation they receive.",
44
+ "Our work tackles this problem by proposing LACE, a novel concept value bottleneck model for controllable text recommendations.",
45
+ "LACE represents each user with a succinct set of human-readable concepts through retrieval given user-interacted documents and learns personalized representations of the concepts based on user documents.",
46
+ "This concept based user profile is then leveraged to make recommendations.",
47
+ "The design of our model affords control over the recommendations through a number of intuitive interactions with a transparent user profile.",
48
+ "We first establish the quality of recommendations obtained from LACE in an offline evaluation on three recommendation tasks spanning six datasets in warm-start, cold-start, and zero-shot setups.",
49
+ "Next, we validate the controllability of LACE under simulated user interactions.",
50
+ "Finally, we implement LACE in an interactive controllable recommender system and conduct a user study to demonstrate that users are able to improve the quality of recommendations they receive through interactions with an editable user profile."
51
+ ]
52
+ },
53
+ {
54
+ "title": "Adaptive Selection of Anchor Items for CUR-based k-NN search with Cross-Encoders",
55
+ "abstract": [
56
+ "Cross-encoder models, which jointly encode and score a query-item pair, are typically prohibitively expensive for k -nearest neighbor search.",
57
+ "Consequently, k -NN search is performed not with a cross-encoder, but with a heuristic retrieve (e.g., using BM25 or dual-encoder) and re-rank approach.",
58
+ "Recent work proposes ANN CUR (Yadav et al., 2022) which uses CUR matrix factorization to produce an embedding space for ef\ufb01cient vector-based search that directly approximates the cross-encoder without the need for dual-encoders.",
59
+ "ANN CUR de\ufb01nes this shared query-item embedding space by scoring the test query against anchor items which are sampled uniformly at random.",
60
+ "While this minimizes average approximation error over all items, unsuitably high approximation error on top-k items remains and leads to poor recall of top-k (and especially top-1) items.",
61
+ "Increasing the number of anchor items is a straightforward way of improving the approximation error and hence k - NN recall of ANN CUR but at the cost of increased inference latency.",
62
+ "In this paper, we propose a new method for adaptively choosing anchor items that minimizes the approximation error for the practically important top-k neighbors for a query with minimal computational overhead.",
63
+ "Our proposed method incrementally selects a suitable set of anchor items for a given test query over several rounds, us-ing anchors chosen in previous rounds to inform selection of more anchor items.",
64
+ "Empiri-cally, our method consistently improves k -NN recall as compared to both ANN CUR and the widely-used dual-encoder-based retrieve-and-rerank approach."
65
+ ]
66
+ },
67
+ {
68
+ "title": "Machine Reading Comprehension using Case-based Reasoning",
69
+ "abstract": [
70
+ "We present an accurate and interpretable method for answer extraction in machine reading comprehension that is reminiscent of case-based reasoning (CBR) from classical AI.",
71
+ "Our method (CBR-MRC) builds upon the hypothesis that contextualized answers to similar questions share semantic similarities with each other.",
72
+ "Given a test question, CBR-MRC first retrieves a set of similar cases from a non-parametric memory and then predicts an answer by selecting the span in the test context that is most similar to the contextualized representations of answers in the retrieved cases.",
73
+ "The semi-parametric nature of our approach allows it to attribute a prediction to the specific set of evidence cases, making it a desirable choice for building reliable and debuggable QA systems.",
74
+ "We show that CBR-MRC provides high accuracy comparable with large reader models and outperforms baselines by 11.5 and 8.4 EM on NaturalQuestions and NewsQA, respectively.",
75
+ "Further, we demonstrate the ability of CBR-MRC in identifying not just the correct answer tokens but also the span with the most relevant supporting evidence.",
76
+ "Lastly, we observe that contexts for certain question types show higher lexical diversity than others and find that CBR-MRC is robust to these variations while performance using fully-parametric methods drops."
77
+ ]
78
+ },
79
+ {
80
+ "title": "Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining",
81
+ "abstract": [
82
+ "Dual encoder models are ubiquitous in modern classification and retrieval.",
83
+ "Crucial for training such dual encoders is an accurate estimation of gradients from the partition function of the softmax over the large output space; this requires finding negative targets that contribute most significantly (\"hard negatives\").",
84
+ "Since dual encoder model parameters change during training, the use of traditional static nearest neighbor indexes can be sub-optimal.",
85
+ "These static indexes (1) periodically require expensive re-building of the index, which in turn requires (2) expensive re-encoding of all targets using updated model parameters.",
86
+ "This paper addresses both of these challenges.",
87
+ "First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree.",
88
+ "Second, we approximate the effect of a gradient update on target encodings with an efficient Nystrom low-rank approximation.",
89
+ "In our empirical study on datasets with over twenty million targets, our approach cuts error by half in relation to oracle brute-force negative mining.",
90
+ "Furthermore, our method surpasses prior state-of-the-art while using 150x less accelerator memory."
91
+ ]
92
+ },
93
+ {
94
+ "title": "KwikBucks: Correlation Clustering with Cheap-Weak and Expensive-Strong Signals",
95
+ "abstract": [
96
+ ","
97
+ ]
98
+ },
99
+ {
100
+ "title": "Low-Resource Compositional Semantic Parsing with Concept Pretraining",
101
+ "abstract": [
102
+ "Semantic parsing plays a key role in digital voice assistants such as Alexa, Siri, and Google Assistant by mapping natural language to structured meaning representations.",
103
+ "When we want to improve the capabilities of a voice assistant by adding a new domain, the underlying semantic parsing model needs to be retrained using thousands of annotated examples from the new domain, which is time-consuming and expensive.",
104
+ "In this work, we present an architecture to perform such domain adaptation automatically, with only a small amount of metadata about the new domain and without any new training data (zero-shot) or with very few examples (few-shot).",
105
+ "We use a base seq2seq (sequence-to-sequence) architecture and augment it with a concept encoder that encodes intent and slot tags from the new domain.",
106
+ "We also introduce a novel decoder-focused approach to pretrain seq2seq models to be concept aware using Wikidata and use it to help our model learn important concepts and perform well in low-resource settings.",
107
+ "We report few-shot and zero-shot results for compositional semantic parsing on the TOPv2 dataset and show that our model outperforms prior approaches in few-shot settings for the TOPv2 and SNIPS datasets."
108
+ ]
109
+ },
110
+ {
111
+ "title": "Drug Repositioning using Consilience of Knowledge Graph Completion Methods",
112
+ "abstract": [
113
+ "Motivation While link prediction methods in knowledge graphs have been increasingly utilized to locate potential associations between compounds and diseases, they suffer from lack of sufficient evidence to explain why a drug and a disease may be indicated.",
114
+ "This is especially true for knowledge graph embedding (KGE) based methods where a drug-disease indication is linked only by information gleaned from a vector representation.",
115
+ "Complementary pathwalking algorithms can increase the confidence of drug repositioning candidates by traversing a knowledge graph.",
116
+ "However, these methods heavily weigh the relatedness of drugs, through their targets, pharmacology or shared diseases.",
117
+ "Furthermore, these methods rely on arbitrarily extracted paths as evidence of a compound to disease indication and lack the ability to make predictions on rare diseases.",
118
+ "Results In this paper, we evaluate seven link prediction methods on a vast biomedical knowledge graph for drug repositioning.",
119
+ "We follow the principle of consilience, and combine the reasoning paths and predictions provided by path-based and KGE methods to not only demonstrate a significant ranking performance improvement but also identify putative drug repositioning indications.",
120
+ "Finally, we highlight the utility of our approach through a potential repositioning indication.",
121
+ "Availability The MIND dataset can be found at 10.5281/zenodo.8117748.",
122
+ "The python code to reproduce the entirety of this analysis can be found at https://github.com/SuLab/{KnowledgeGraphEmbedding, CBRonMRN}. Contact Andrew I. Su at asu@scripps.edu Supplementary information Supplementary data are available at The Journal Title online."
123
+ ]
124
+ },
125
+ {
126
+ "title": "Longtonotes: OntoNotes with Longer Coreference Chains",
127
+ "abstract": [
128
+ "Ontonotes has served as the most important benchmark for coreference resolution.",
129
+ "However, for ease of annotation, several long documents in Ontonotes were split into smaller parts.",
130
+ "In this work, we build a corpus of coreference-annotated documents of significantly longer length than what is currently available.",
131
+ "We do so by providing an accurate, manually-curated, merging of annotations from documents that were split into multiple parts in the original Ontonotes annotation process.",
132
+ "The resulting corpus, which we call LongtoNotes contains documents in multiple genres of the English language with varying lengths, the longest of which are up to 8x the length of documents in Ontonotes, and 2x those in Litbank.",
133
+ "We evaluate state-of-the-art neural coreference systems on this new corpus, analyze the relationships between model architectures/hyperparameters and document length on performance and efficiency of the models, and demonstrate areas of improvement in long-document coreference modelling revealed by our new corpus."
134
+ ]
135
+ },
136
+ {
137
+ "title": "Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization",
138
+ "abstract": [
139
+ "Efficient k-nearest neighbor search is a fundamental task, foundational for many problems in NLP.",
140
+ "When the similarity is measured by dot-product between dual-encoder vectors or L2-distance, there already exist many scalable and efficient search methods.",
141
+ "But not so when similarity is measured by more accurate and expensive black-box neural similarity models, such as cross-encoders, which jointly encode the query and candidate neighbor.",
142
+ "The cross-encoders\u2019 high computational cost typically limits their use to reranking candidates retrieved by a cheaper model, such as dual encoder or TF-IDF.",
143
+ "However, the accuracy of such a two-stage approach is upper-bounded by the recall of the initial candidate set, and potentially requires additional training to align the auxiliary retrieval model with the cross-encoder model.",
144
+ "In this paper, we present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder.",
145
+ "Retrieval is made efficient with CUR decomposition, a matrix decomposition approach that approximates all pairwise cross-encoder distances from a small subset of rows and columns of the distance matrix.",
146
+ "Indexing items using our approach is computationally cheaper than training an auxiliary dual-encoder model through distillation.",
147
+ "Empirically, for k > 10, our approach provides test-time recall-vs-computational cost trade-offs superior to the current widely-used methods that re-rank items retrieved using a dual-encoder or TF-IDF."
148
+ ]
149
+ },
150
+ {
151
+ "title": "An Evaluative Measure of Clustering Methods Incorporating Hyperparameter Sensitivity",
152
+ "abstract": [
153
+ "Clustering algorithms are often evaluated using metrics which compare with ground-truth cluster assignments, such as Rand index and NMI.",
154
+ "Algorithm performance may vary widely for different hyperparameters, however, and thus model selection based on optimal performance for these metrics is discordant with how these algorithms are applied in practice, where labels are unavailable and tuning is often more art than science.",
155
+ "It is therefore desirable to compare clustering algorithms not only on their optimally tuned performance, but also some notion of how realistic it would be to obtain this performance in practice.",
156
+ "We propose an evaluation of clustering methods capturing this ease-of-tuning by modeling the expected best clustering score under a given computation budget.",
157
+ "To encourage the adoption of the proposed metric alongside classic clustering evaluations, we provide an extensible benchmarking framework.",
158
+ "We perform an extensive empirical evaluation of our proposed metric on popular clustering algorithms over a large collection of datasets from different domains, and observe that our new metric leads to several noteworthy observations."
159
+ ]
160
+ },
161
+ {
162
+ "title": "Enhanced Distant Supervision with State-Change Information for Relation Extraction",
163
+ "abstract": [
164
+ "In this work, we introduce a method for enhancing distant supervision with state-change information for relation extraction.",
165
+ "We provide a training dataset created via this process, along with manually annotated development and test sets.",
166
+ "We present an analysis of the curation process and data, and compare it to standard distant supervision.",
167
+ "We demonstrate that the addition of state-change information reduces noise when used for static relation extraction, and can also be used to train a relation-extraction system that detects a change of state in relations."
168
+ ]
169
+ },
170
+ {
171
+ "title": "Modeling Label Space Interactions in Multi-label Classification using Box Embeddings",
172
+ "abstract": [
173
+ "Multi-label classi\ufb01cation is a challenging structured prediction task in which a set of output class labels are predicted for each input.",
174
+ "Real-world datasets often have natural or latent taxonomic relationships between labels, making it desirable for models to employ label representations capable of capturing such taxonomies.",
175
+ "Most existing multi-label classi\ufb01cation methods do not do so, resulting in label predictions that are inconsistent with the taxonomic constraints, thus failing to accurately represent the fundamentals of problem setting.",
176
+ "In this work we introduce the multi-label box model (MBM), a multi-label classi\ufb01cation method that combines the encoding power of neural networks with the inductive bias and probabilistic semantics of box embeddings (Vilnis, et al 2018).",
177
+ "Box embeddings can be under-stood as trainable Venn-diagrams based on hyper-rectangles.",
178
+ "Representing labels by boxes rather than vectors, MBM is able to capture taxonomic relations among labels.",
179
+ "Furthermore, since box embeddings allow these relations to be learned by stochastic gradient descent from data, and to be read as calibrated conditional probabilities, our model is endowed with a high degree of interpretability.",
180
+ "This interpretability also facilitates the injection of partial information about label-label relationships into model training, to further improve its consistency.",
181
+ "We provide theoretical grounding for our method and show experimentally the model\u2019s ability to learn the true latent taxonomic structure from data.",
182
+ "Through extensive empirical evaluations on both small and large-scale multi-label classi\ufb01cation datasets, we show that BBM can signi\ufb01cantly improve taxonomic consistency while preserving or surpassing state-of-the-art predictive performance."
183
+ ]
184
+ },
185
+ {
186
+ "title": "CBR-iKB: A Case-Based Reasoning Approach for Question Answering over Incomplete Knowledge Bases",
187
+ "abstract": [
188
+ "Knowledge bases (KBs) are often incomplete and constantly changing in practice.",
189
+ "Yet, in many question answering applications coupled with knowledge bases, the sparse nature of KBs is often overlooked.",
190
+ "To this end, we propose a case-based reasoning approach, CBR-iKB, for knowledge base question answering (KBQA) with incomplete-KB as our main focus.",
191
+ "Our method ensembles decisions from multiple reasoning chains with a novel nonparametric reasoning algorithm.",
192
+ "By design, CBR-iKB can seamlessly adapt to changes in KBs without any task-specific training or fine-tuning.",
193
+ "Our method achieves 100% accuracy on MetaQA and establishes new state-of-the-art on multiple benchmarks.",
194
+ "For instance, CBR-iKB achieves an accuracy of 70% on WebQSP under the incomplete-KB setting, outperforming the existing state-of-the-art method by 22.3%."
195
+ ]
196
+ },
197
+ {
198
+ "title": "Softmax Bottleneck Makes Language Models Unable to Represent Multi-mode Word Distributions",
199
+ "abstract": [
200
+ "Neural language models (LMs) such as GPT-2 estimate the probability distribution over the next word by a softmax over the vocabulary.",
201
+ "The softmax layer produces the distribution based on the dot products of a single hidden state and the embeddings of words in the vocabulary.",
202
+ "However, we discover that this single hidden state cannot produce all probability distributions regardless of the LM size or training data size because the single hidden state embedding cannot be close to the embeddings of all the possible next words simultaneously when there are other interfering word embeddings between them.",
203
+ "In this work, we demonstrate the importance of this limitation both theoretically and practically.",
204
+ "Our work not only deepens our understanding of softmax bottleneck and mixture of softmax (MoS) but also inspires us to propose multi-facet softmax (MFS) to address the limitations of MoS. Extensive empirical analyses confirm our findings and show that against MoS, the proposed MFS achieves two-fold improvements in the perplexity of GPT-2 and BERT."
205
+ ]
206
+ },
207
+ {
208
+ "title": "Inducing and Using Alignments for Transition-based AMR Parsing",
209
+ "abstract": [
210
+ "Transition-based parsers for Abstract Meaning Representation (AMR) rely on node-to-word alignments.",
211
+ "These alignments are learned separately from parser training and require a complex pipeline of rule-based components, pre-processing, and post-processing to satisfy domain-specific constraints.",
212
+ "Parsers also train on a point-estimate of the alignment pipeline, neglecting the uncertainty due to the inherent ambiguity of alignment.",
213
+ "In this work we explore two avenues for overcoming these limitations.",
214
+ "First, we propose a neural aligner for AMR that learns node-to-word alignments without relying on complex pipelines.",
215
+ "We subsequently explore a tighter integration of aligner and parser training by considering a distribution over oracle action sequences arising from aligner uncertainty.",
216
+ "Empirical results show this approach leads to more accurate alignments and generalization better from the AMR2.0 to AMR3.0 corpora.",
217
+ "We attain a new state-of-the art for gold-only trained models, matching silver-trained performance without the need for beam search on AMR3.0."
218
+ ]
219
+ },
220
+ {
221
+ "title": "Augmenting Scientific Creativity with Retrieval across Knowledge Domains",
222
+ "abstract": [
223
+ "Exposure to ideas in domains outside a scientist's own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas.",
224
+ "While improved performance in scholarly search engines can help scientists efficiently identify relevant advances in domains they may already be familiar with, it may fall short of helping them explore diverse ideas \\textit{outside} such domains.",
225
+ "In this paper we explore the design of systems aimed at augmenting the end-user ability in cross-domain exploration with flexible query specification.",
226
+ "To this end, we develop an exploratory search system in which end-users can select a portion of text core to their interest from a paper abstract and retrieve papers that have a high similarity to the user-selected core aspect but differ in terms of domains.",
227
+ "Furthermore, end-users can `zoom in' to specific domain clusters to retrieve more papers from them and understand nuanced differences within the clusters.",
228
+ "Our case studies with scientists uncover opportunities and design implications for systems aimed at facilitating cross-domain exploration and inspiration."
229
+ ]
230
+ },
231
+ {
232
+ "title": "Unsupervised Partial Sentence Matching for Cited Text Identification",
233
+ "abstract": [
234
+ "Given a citation in the body of a research paper, cited text identification aims to find the sentences in the cited paper that are most relevant to the citing sentence.",
235
+ "The task is fundamentally one of sentence matching, where affinity is often assessed by a cosine similarity between sentence embeddings.",
236
+ "However, (a) sentences may not be well-represented by a single embedding because they contain multiple distinct semantic aspects, and (b) good matches may not require a strong match in all aspects.",
237
+ "To overcome these limitations, we propose a simple and efficient unsupervised method for cited text identification that adapts an asymmetric similarity measure to allow partial matches of multiple aspects in both sentences.",
238
+ "On the CL-SciSumm dataset we find that our method outperforms a baseline symmetric approach, and, surprisingly, also outperforms all supervised and unsupervised systems submitted to past editions of CL-SciSumm Shared Task 1a."
239
+ ]
240
+ }
241
+ ],
242
+ "user_kps": [
243
+ "clustering models",
244
+ "computational drug repositioning",
245
+ "content-based recommenders",
246
+ "convolutive non-negative matrix factorization",
247
+ "cross-document coreference resolution",
248
+ "distant supervision",
249
+ "drug-target interaction prediction",
250
+ "exploratory searches",
251
+ "factored language models",
252
+ "item embeddings",
253
+ "machine reading comprehension",
254
+ "multi-label classification",
255
+ "multilabel learning",
256
+ "neural language model",
257
+ "neural semantic parser",
258
+ "paraphrastic sentence embeddings",
259
+ "ranked retrieval",
260
+ "recurrent neural network language model",
261
+ "representation learning",
262
+ "similarity based retrieval"
263
+ ]
264
+ }
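All seedset files added in this commit share the schema shown above: a username, a Semantic Scholar author id, a list of papers (each with a title and a sentence-split abstract), and a list of user keyphrases. Below is a minimal loader sketch for this schema using only the Python standard library; load_seedset and in_path are illustrative names, not part of app.py.

import json
import os

EXPECTED_KEYS = {'username', 's2_authorid', 'papers', 'user_kps'}

def load_seedset(in_path, username):
    # Seedset files live at data/users/<username>/seedset-<username>-maple.json,
    # matching the paths of the files added in this commit.
    fpath = os.path.join(in_path, 'users', username, f'seedset-{username}-maple.json')
    with open(fpath) as f:
        seedset = json.load(f)
    missing = EXPECTED_KEYS - set(seedset)
    if missing:
        raise ValueError(f'{fpath} is missing keys: {sorted(missing)}')
    for paper in seedset['papers']:
        # Each paper carries a title string and an abstract stored as a list of sentences.
        if 'title' not in paper or not isinstance(paper.get('abstract'), list):
            raise ValueError(f'malformed paper entry in {fpath}')
    return seedset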
data/users/nmahyar/seedset-nmahyar-maple.json ADDED
@@ -0,0 +1,276 @@
1
+ {
2
+ "username": "nmahyar",
3
+ "s2_authorid": "1936892",
4
+ "papers": [
5
+ {
6
+ "title": "Supporting Serendipitous Discovery and Balanced Analysis of Online Product Reviews with Interaction-Driven Metrics and Bias-Mitigating Suggestions",
7
+ "abstract": [
8
+ "In this study, we investigate how supporting serendipitous discovery and analysis of online product reviews can encourage readers to explore reviews more comprehensively prior to making purchase decisions.",
9
+ "We propose two interventions \u2014 Exploration Metrics that can help readers understand and track their exploration patterns through visual indicators and a Bias Mitigation Model that intends to maximize knowledge discovery by suggesting sentiment and semantically diverse reviews.",
10
+ "We designed, developed, and evaluated a text analytics system called Serendyze, where we integrated these interventions.",
11
+ "We asked 100 crowd workers to use Serendyze to make purchase decisions based on product reviews.",
12
+ "Our evaluation suggests that exploration metrics enabled readers to efficiently cover more reviews in a balanced way, and suggestions from the bias mitigation model influenced readers to make confident data-driven decisions.",
13
+ "We discuss the role of user agency and trust in text-level analysis systems and their applicability in domains beyond review exploration."
14
+ ]
15
+ },
16
+ {
17
+ "title": "Scientometric Analysis of Interdisciplinary Collaboration and Gender Trends in 30 Years of IEEE VIS Publications",
18
+ "abstract": [
19
+ "We present the results of a scientometric analysis of 30 years of IEEE VIS publications between 1990-2020, in which we conducted a multifaceted analysis of interdisciplinary collaboration and gender composition among authors.",
20
+ "To this end, we curated BiblioVIS, a bibliometric dataset that contains rich metadata about IEEE VIS publications, including 3032 papers and 6113 authors.",
21
+ "One of the main factors differentiating BiblioVIS from similar datasets is the authors' gender and discipline data, which we inferred through iterative rounds of computational and manual processes.",
22
+ "Our analysis shows that, by and large, inter-institutional and interdisciplinary collaboration has been steadily growing over the past 30 years.",
23
+ "However, interdisciplinary research was mainly between a few fields, including Computer Science, Engineering and Technology, and Medicine and Health disciplines.",
24
+ "Our analysis of gender shows steady growth in women's authorship.",
25
+ "Despite this growth, the gender distribution is still highly skewed, with men dominating (~75%) of this space.",
26
+ "Our predictive analysis of gender balance shows that if the current trends continue, gender parity in the visualization field will not be reached before the third quarter of the century (~2070).",
27
+ "Our primary goal in this work is to call the visualization community's attention to the critical topics of collaboration, diversity, and gender.",
28
+ "Our research offers critical insights through the lens of diversity and gender to help accelerate progress towards a more diverse and representative research community."
29
+ ]
30
+ },
31
+ {
32
+ "title": "Characterizing Uncertainty in the Visual Text Analysis Pipeline",
33
+ "abstract": [
34
+ "Current visual text analysis approaches rely on sophisticated processing pipelines.",
35
+ "Each step of such a pipeline potentially amplifies any uncertainties from the previous step.",
36
+ "To ensure the comprehensibility and interoperability of the results, it is of paramount importance to clearly communicate the uncertainty not only of the output but also within the pipeline.",
37
+ "In this paper, we characterize the sources of uncertainty along the visual text analysis pipeline.",
38
+ "Within its three phases of labeling, modeling, and analysis, we identify six sources, discuss the type of uncertainty they create, and how they propagate.",
39
+ "The goal of this paper is to bring the attention of the visualization community to additional types and sources of uncertainty in visual text analysis and to call for careful consideration, highlighting opportunities for future research."
40
+ ]
41
+ },
42
+ {
43
+ "title": "Of Course it's Political! A Critical Inquiry into Underemphasized Dimensions in Civic Text Visualization",
44
+ "abstract": [
45
+ "Recent developments in critical information visualization have brought the field's attention to political, feminist, ethical, and rhetorical aspects of data visualization.",
46
+ "However, less work has explored the interplay between design decisions and political ramifications\u2014structures of authority, means of representation, etc.",
47
+ "In this paper, we build upon these critical perspectives and highlight the political aspect of civic text visualization especially in the context of democratic decision\u2010making.",
48
+ "Based on a critical analysis of survey papers about text visualization in general, followed by a review on the status quo of text visualization in civics, we argue that civic text visualization inherits an exclusively analytic framing.",
49
+ "This framing leads to a series of issues and challenges in the fundamentally political context of civics, such as misinterpretation of data, missing minority voices, and excluding the public from decision making processes.",
50
+ "To span this gap between political context and analytic framing, we provide a series of two\u2010pole conceptual dimensions, such as from singular user to multiple relationships, and from complexity to inclusivity of visualization design.",
51
+ "For each dimension, we discuss how the tensions between these poles can help surface the political ramifications of design decisions in civic text visualization.",
52
+ "These dimensions can thus help visualization researchers, designers, and practitioners attend more intentionally to these political aspects and inspire their design choices.",
53
+ "We conclude by suggesting that these dimensions may be useful for visualization design across a variety of application domains, beyond civic text visualization."
54
+ ]
55
+ },
56
+ {
57
+ "title": "Designing With Pictographs: Envision Topics Without Sacrificing Understanding",
58
+ "abstract": [
59
+ "Past studies have shown that when a visualization uses pictographs to encode data, they have a positive effect on memory, engagement, and assessment of risk.",
60
+ "However, little is known about how pictographs affect one\u2019s ability to understand a visualization, beyond memory for values and trends.",
61
+ "We conducted two crowdsourced experiments to compare the effectiveness of using pictographs when showing part-to-whole relationships.",
62
+ "In Experiment 1, we compared pictograph arrays to more traditional bar and pie charts.",
63
+ "We tested participants\u2019 ability to generate high-level insights following Bloom\u2019s taxonomy of educational objectives via 6 free-response questions.",
64
+ "We found that accuracy for extracting information and generating insights did not differ overall between the two versions.",
65
+ "To explore the motivating differences between the designs, we conducted a second experiment where participants compared charts containing pictograph arrays to more traditional charts on 5 metrics and explained their reasoning.",
66
+ "We found that some participants preferred the way that pictographs allowed them to envision the topic more easily, while others preferred traditional bar and pie charts because they seem less cluttered and faster to read.",
67
+ "These results suggest that, at least in simple visualizations depicting part-to-whole relationships, the choice of using pictographs has little influence on sensemaking and insight extraction.",
68
+ "When deciding whether to use pictograph arrays, designers should consider visual appeal, perceived comprehension time, ease of envisioning the topic, and clutteredness."
69
+ ]
70
+ },
71
+ {
72
+ "title": "An Interdisciplinary Perspective on Evaluation and Experimental Design for Visual Text Analytics: Position Paper",
73
+ "abstract": [
74
+ "Appropriate evaluation and experimental design are fundamental for empirical sciences, particularly in data-driven fields.",
75
+ "Due to the successes in computational modeling of languages, for instance, research outcomes are having an increasingly immediate impact on end users.",
76
+ "As the gap in adoption by end users decreases, the need increases to ensure that tools and models developed by the research communities and practitioners are reliable, trustworthy, and supportive of the users in their goals.",
77
+ "In this position paper, we focus on the issues of evaluating visual text analytics approaches.",
78
+ "We take an interdisciplinary perspective from the visualization and natural language processing communities, as we argue that the design and validation of visual text analytics include concerns beyond computational or visual/interactive methods on their own.",
79
+ "We identify four key groups of challenges for evaluating visual text analytics approaches (data ambiguity, experimental design, user trust, and \"big picture\" concerns) and provide suggestions for research opportunities from an interdisciplinary perspective."
80
+ ]
81
+ },
82
+ {
83
+ "title": "ClioQuery: Interactive Query-oriented Text Analytics for Comprehensive Investigation of Historical News Archives",
84
+ "abstract": [
85
+ "Historians and archivists often find and analyze the occurrences of query words in newspaper archives to help answer fundamental questions about society.",
86
+ "But much work in text analytics focuses on helping people investigate other textual units, such as events, clusters, ranked documents, entity relationships, or thematic hierarchies.",
87
+ "Informed by a study into the needs of historians and archivists, we thus propose ClioQuery, a text analytics system uniquely organized around the analysis of query words in context.",
88
+ "ClioQuery applies text simplification techniques from natural language processing to help historians quickly and comprehensively gather and analyze all occurrences of a query word across an archive.",
89
+ "It also pairs these new NLP methods with more traditional features like linked views and in-text highlighting to help engender trust in summarization techniques.",
90
+ "We evaluate ClioQuery with two separate user studies, in which historians explain how ClioQuery\u2019s novel text simplification features can help facilitate historical research.",
91
+ "We also evaluate with a separate quantitative comparison study, which shows that ClioQuery helps crowdworkers find and remember historical information.",
92
+ "Such results suggest possible new directions for text analytics in other query-oriented settings."
93
+ ]
94
+ },
95
+ {
96
+ "title": "CommunityPulse: Facilitating Community Input Analysis by Surfacing Hidden Insights, Reflections, and Priorities",
97
+ "abstract": [
98
+ "Increased access to online engagement platforms has created a shift in civic practice, enabling civic leaders to broaden their outreach to collect a larger number of community input, such as comments and ideas.",
99
+ "However, sensemaking of such input remains a challenge due to the unstructured nature of text comments and ambiguity of human language.",
100
+ "Hence, community input is often left unanalyzed and unutilized in policymaking.",
101
+ "To address this problem, we interviewed 14 civic leaders to understand their practices and requirements.",
102
+ "We identified challenges around organizing the unstructured community input and surfacing community\u2019s reflections beyond binary sentiments.",
103
+ "Based on these insights, we built CommunityPulse, an interactive system that combines text analysis and visualization to scaffold different facets of community input.",
104
+ "Our evaluation with another 15 experts suggests CommunityPulse\u2019s efficacy in surfacing multiple facets such as reflections, priorities, and hidden insights while reducing the required time, effort, and expertise for community input analysis."
105
+ ]
106
+ },
107
+ {
108
+ "title": "Making the Invisible Visible: Risks and Benefits of Disclosing Metadata in Visualization",
109
+ "abstract": [
110
+ "Accompanying a data visualization with metadata may benefit readers by facilitating content understanding, strengthening trust, and providing accountability.",
111
+ "However, providing this kind of information may also have negative, unintended consequences, such as biasing readers\u2019 interpretations, a loss of trust as a result of too much transparency, and the possibility of opening visualization creators with minoritized identities up to undeserved critique.",
112
+ "To help future visualization researchers and practitioners decide what kinds of metadata to include, we discuss some of the potential benefits and risks of disclosing five kinds of metadata: metadata about the source of the underlying data; the cleaning and processing conducted; the marks, channels, and other design elements used; the people who directly created the visualization; and the people for whom the visualization was created.",
113
+ "We conclude by proposing a few open research questions related to how to communicate metadata about visualizations."
114
+ ]
115
+ },
116
+ {
117
+ "title": "RisingEMOTIONS: Bridging Art and Technology to Visualize Public\u2019s Emotions about Climate Change",
118
+ "abstract": [
119
+ "In response to the threat posed by sea-level rise, coastal cities must rapidly adapt and transform vulnerable areas to protect endangered communities.",
120
+ "As such, raising awareness and engaging affected communities in planning for adaptation strategies is critical.",
121
+ "However, in the US, public engagement with climate change is low, especially among underrepresented populations.",
122
+ "To address this challenge, we designed and implemented RisingEMOTIONS, a site-specific collaborative art installation situated in East Boston that combines public art with digital technology.",
123
+ "The installation depicts the impacts of sea-level rise by visualizing local projected flood levels and the public\u2019s emotions toward this threat.",
124
+ "The community\u2019s engagement with our project demonstrated the potential for public art to create interest and raise awareness of climate change.",
125
+ "We discuss the potential for continued growth in the way that digital tools and public art can support equitable resilience planning through increased public engagement."
126
+ ]
127
+ },
128
+ {
129
+ "title": "Impact of the COVID-19 Pandemic on the Academic Community Results from a survey conducted at University of Massachusetts Amherst",
130
+ "abstract": [
131
+ "The COVID-19 pandemic has significantly impacted academic life in the United States and beyond.",
132
+ "To gain a better understanding of its impact on the academic community, we conducted a large-scale survey at the University of Massachusetts Amherst.",
133
+ "We collected multifaceted data from students, staff, and faculty on several aspects of their lives, such as mental and physical health, productivity, and finances.",
134
+ "All our respondents expressed mental and physical issues and concerns, such as increased stress and depression levels.",
135
+ "Financial difficulties seem to have the most considerable toll on staff and undergraduate students, while productivity challenges were mostly expressed by faculty and graduate students.",
136
+ "As universities face many important decisions with respect to mitigating the effects of this pandemic, we present our findings with the intent of shedding light on the challenges faced by various academic groups in the face of the pandemic, calling attention to the differences between groups.",
137
+ "We also contribute a discussion highlighting how the results translate to policies for the effective and timely support of the categories of respondents who need them most.",
138
+ "Finally, the survey itself, which includes conditional logic allowing for personalized questions, serves as a template for further data collection, facilitating a comparison of the impact on campuses across the United States."
139
+ ]
140
+ },
141
+ {
142
+ "title": "A Framework for Open Civic Design: Integrating Public Participation, Crowdsourcing, and Design Thinking",
143
+ "abstract": [
144
+ "Civic problems are often too complex to solve through traditional top-down strategies.",
145
+ "Various governments and civic initiatives have explored more community-driven strategies where citizens get involved with defining problems and innovating solutions.",
146
+ "While certain people may feel more empowered, the public at large often does not have accessible, flexible, and meaningful ways to engage.",
147
+ "Prior theoretical frameworks for public participation typically offer a one-size-fits-all model based on face-to-face engagement and fail to recognize the barriers faced by even the most engaged citizens.",
148
+ "In this article, we explore a vision for open civic design where we integrate theoretical frameworks from public engagement, crowdsourcing, and design thinking to consider the role technology can play in lowering barriers to large-scale participation, scaffolding problem-solving activities, and providing flexible options that cater to individuals\u2019 skills, availability, and interests.",
149
+ "We describe our novel theoretical framework and analyze the key goals associated with this vision: (1) to promote inclusive and sustained participation in civics; (2) to facilitate effective management of large-scale participation; and (3) to provide a structured process for achieving effective solutions.",
150
+ "We present case studies of existing civic design initiatives and discuss challenges, limitations, and future work related to operationalizing, implementing, and testing this framework."
151
+ ]
152
+ },
153
+ {
154
+ "title": "How to evaluate data visualizations across different levels of understanding",
155
+ "abstract": [
156
+ "Understanding a visualization is a multi-level process.",
157
+ "A reader must extract and extrapolate from numeric facts, understand how those facts apply to both the context of the data and other potential contexts, and draw or evaluate conclusions from the data.",
158
+ "A well-designed visualization should support each of these levels of understanding.",
159
+ "We diagnose levels of understanding of visualized data by adapting Bloom\u2019s taxonomy, a common framework from the education literature.",
160
+ "We describe each level of the framework and provide examples for how it can be applied to evaluate the efficacy of data visualizations along six levels of knowledge acquisition - knowledge, comprehension, application, analysis, synthesis, and evaluation.",
161
+ "We present three case studies showing that this framework expands on existing methods to comprehensively measure how a visualization design facilitates a viewer\u2019s understanding of visualizations.",
162
+ "Although Bloom\u2019s original taxonomy suggests a strong hierarchical structure for some domains, we found few examples of dependent relationships between performance at different levels for our three case studies.",
163
+ "If this level-independence holds across new tested visualizations, the taxonomy could serve to inspire more targeted evaluations of levels of understanding that are relevant to a communication goal."
164
+ ]
165
+ },
166
+ {
167
+ "title": "Example-Driven User Intent Discovery: Empowering Users to Cross the SQL Barrier Through Query by Example",
168
+ "abstract": [
169
+ "Traditional data systems require specialized technical skills where users need to understand the data organization and write precise queries to access data.",
170
+ "Therefore, novice users who lack technical expertise face hurdles in perusing and analyzing data.",
171
+ "Existing tools assist in formulating queries through keyword search, query recommendation, and query auto-completion, but still require some technical expertise.",
172
+ "An alternative method for accessing data is Query by Example (QBE), where users express their data exploration intent simply by providing examples of their intended data.",
173
+ "We study a state-of-the-art QBE system called SQUID, and contrast it with traditional SQL querying.",
174
+ "Our comparative user studies demonstrate that users with varying expertise are significantly more effective and efficient with SQUID than SQL.",
175
+ "We find that SQUID eliminates the barriers in studying the database schema, formalizing task semantics, and writing syntactically correct SQL queries, and thus, substantially alleviates the need for technical expertise in data exploration."
176
+ ]
177
+ },
178
+ {
179
+ "title": "CommunityClick: Towards Improving Inclusivity in Town Halls",
180
+ "abstract": [
181
+ "Despite the lack of inclusive participation from attendees and civic organizers struggle to capture their feedback in reports, local governments continue to depend on traditional methods such as town halls for community consultation.",
182
+ "We present CommunityClick, a community-sourcing system that uses modified iClickers to enable silent attendees' to provide real-time feedback and records meeting audio to capture vocal attendees' feedback.",
183
+ "These feedbacks are combined to generate an augmented meeting transcript and feedback-weighted summary, incorporated into an interactive tool for organizers to author reports.",
184
+ "Our field deployment at a town hall and interviews with 8 organizers demonstrate CommunityClick's utility in improving inclusivity and authoring more comprehensive reports."
185
+ ]
186
+ },
187
+ {
188
+ "title": "Towards Understanding Desiderata for Large-Scale Civic Input Analysis",
189
+ "abstract": [
190
+ "Advancement in digital civics and the emergence of online platforms have enabled vast amounts of community members to share their input on various civic proposals.",
191
+ "The intricacy of the community input analysis process, coupled with the increased scale of community engagement, makes community input analysis particularly challenging.",
192
+ "Civic leaders, who gather, analyze, and make critical decisions based on community input, struggle to make sense of large-scale unstructured community input due to lack of time, analytical skills, and specialized technologies.",
193
+ "In this qualitative study, we investigated civic leaders' requirements that can accelerate the community input analysis process and help them to gain actionable insights to make better decisions.",
194
+ "Our interviews conducted with 14 civic leaders revealed a dichotomous nature of requirements based on their roles and analysis practices.",
195
+ "The interviews also revealed the civic leaders' desire to understand the community's opinions beyond sentiments and how text analysis and visualization can bring structure and enable sensemaking of community input.",
196
+ "This study is our first step towards exploring the design of community input analysis technologies for civic leaders that can contribute to democratic decision-making in digital civics."
197
+ ]
198
+ },
199
+ {
200
+ "title": "Designing Technology for Sociotechnical Problems: Challenges and Considerations",
201
+ "abstract": [
202
+ "Designing technology for sociotechnical problems is challenging due to the heterogeneity of stakeholders\u2019 needs, the diversity among their values and perspectives, and the disparity in their technical skills.",
203
+ "Careful considerations are needed to ensure that data collection is inclusive and representative of the target populations.",
204
+ "It is also important to employ data analysis methods that are compatible with users\u2019 technical skills and are capable of drawing a representative picture of people's values, priorities, and needs.",
205
+ "However, current technical solutions often fail to meet these critical requirements.",
206
+ "In this article, we present a set of empirically-driven design considerations for building technological interventions to address sociotechnical issues.",
207
+ "We then discuss open challenges and tradeoffs around privacy, ethics, bias, uncertainty, and trust.",
208
+ "We conclude with a call to action for researchers to advance the domain knowledge and improve our technological arsenal for addressing sociotechnical problems."
209
+ ]
210
+ },
211
+ {
212
+ "title": "CommunityClick: Capturing and Reporting Community Feedback from Town Halls to Improve Inclusivity",
213
+ "abstract": [
214
+ "Local governments still depend on traditional town halls for community consultation, despite problems such as a lack of inclusive participation for attendees and difficulty for civic organizers to capture attendees' feedback in reports.",
215
+ "Building on a formative study with 66 town hall attendees and 20 organizers, we designed and developed CommunityClick, a communitysourcing system that captures attendees' feedback in an inclusive manner and enables organizers to author more comprehensive reports.",
216
+ "During the meeting, in addition to recording meeting audio to capture vocal attendees' feedback, we modify iClickers to give voice to reticent attendees by allowing them to provide real-time feedback beyond a binary signal.",
217
+ "This information then automatically feeds into a meeting transcript augmented with attendees' feedback and organizers' tags.",
218
+ "The augmented transcript along with a feedback-weighted summary of the transcript generated from text analysis methods is incorporated into an interactive authoring tool for organizers to write reports.",
219
+ "From a field experiment at a town hall meeting, we demonstrate how CommunityClick can improve inclusivity by providing multiple avenues for attendees to share opinions.",
220
+ "Additionally, interviews with eight expert organizers demonstrate CommunityClick's utility in creating more comprehensive and accurate reports to inform critical civic decision-making.",
221
+ "We discuss the possibility of integrating CommunityClick with town hall meetings in the future as well as expanding to other domains."
222
+ ]
223
+ },
224
+ {
225
+ "title": "Exploring How International Graduate Students in the US Seek Support",
226
+ "abstract": [
227
+ "International Graduate Students (IGS) are an integral part of the United States (US) higher education ecosystem.",
228
+ "However, they face enormous challenges while transitioning to the US due to cultural shock, language barriers, and intense academic pressure.",
229
+ "These issues can cause poor mental health, and in some cases, increased risk of self-harm.",
230
+ "The relative ease of access and ubiquity of social technology have the potential for supporting IGS during socio-cultural transitions.",
231
+ "However, little is known about how IGS use social technology for seeking support.",
232
+ "To address this gap, we conducted a qualitative study with the IGS in Western Massachusetts to understand how they seek social support.",
233
+ "Our preliminary findings indicate that our participants preferred seeking informational and network support through social technology.",
234
+ "They expressed that they preferred to seek emotional support in-person and from their close contacts but we found a latent pattern that shows they use technology passively (e.g, following others posts, comments, etc).",
235
+ "We also found that over time, their support-seeking preference changes from people of similar ethnicity to people with similar experiences.",
236
+ "Finally, we identified language as the primary barrier to actively seek any kind of support through technology."
237
+ ]
238
+ },
239
+ {
240
+ "title": "Rehabilitation Games in Real-World Clinical Settings",
241
+ "abstract": [
242
+ "Upper-limb impairments due to stroke can severely affect the quality of life in patients.",
243
+ "Scientific evidence supports that repetitive rehabilitation exercises can improve motor ability in stroke patients.",
244
+ "Rehabilitation games gained tremendous interest among researchers and clinicians because of their potential to make the seemingly mundane, enduring rehabilitation therapies more engaging.",
245
+ "However, routine and longitudinal use of rehabilitation games in real-world clinical settings has not been investigated in depth.",
246
+ "Particularly, we know little about current practices, challenges, and their potential impacts on therapeutic outcomes.",
247
+ "To address this gap, we established a partnership with a rehabilitation hospital where game-assisted rehabilitation was routinely employed over a 2-year period.",
248
+ "We then conducted an observational study, in which we observed 11 game-assisted therapy sessions and interviewed 15 therapists who moderated the therapy.",
249
+ "Significant findings include (1) different engagement patterns of stroke patients in game-assisted therapy, (2) imperative roles of therapists in moderating games and challenges that therapists face during game-assisted therapy, and (3) lack of support for therapists in delivering patient-centered, personalized therapy to individual stroke patients.",
250
+ "Furthermore, we discuss design implications for more effective rehabilitation game therapies that take into consideration both patients and therapists and their specific needs."
251
+ ]
252
+ }
253
+ ],
254
+ "user_kps": [
255
+ "bibliometric mapping",
256
+ "community context",
257
+ "computer-assisted qualitative data analysis software",
258
+ "exploratory queries",
259
+ "human-centered design",
260
+ "impacted communities",
261
+ "international students",
262
+ "participatory design",
263
+ "qualitative survey",
264
+ "rehabilitation gaming system",
265
+ "socio-emotional support",
266
+ "text analytics",
267
+ "text visualization",
268
+ "text-based analysis",
269
+ "university students",
270
+ "user exploration",
271
+ "user participation",
272
+ "user reviews",
273
+ "visual ambiguities",
274
+ "visualization literacy"
275
+ ]
276
+ }
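The same directory layout repeats for every profile, so the available usernames can be discovered by scanning data/users; a short sketch, with data_root an assumed repo-relative path:

import os

data_root = 'data'  # assumed repo-relative root, per the file paths in this commit
users_dir = os.path.join(data_root, 'users')
for username in sorted(os.listdir(users_dir)):
    seedset = os.path.join(users_dir, username, f'seedset-{username}-maple.json')
    status = 'ok' if os.path.exists(seedset) else 'missing seedset'
    print(f'{username}: {status}')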
data/users/nshah/seedset-nshah-maple.json ADDED
@@ -0,0 +1,300 @@
1
+ {
2
+ "username": "nshah",
3
+ "s2_authorid": "1737249",
4
+ "papers": [
5
+ {
6
+ "title": "Time is Money: Strategic Timing Games in Proof-of-Stake Protocols",
7
+ "abstract": [
8
+ "We propose a model suggesting that honest-but-rational consensus participants may play timing games, and strategically delay their block proposal to optimize MEV capture, while still ensuring the proposal's timely inclusion in the canonical chain.",
9
+ "In this context, ensuring economic fairness among consensus participants is critical to preserving decentralization.",
10
+ "We contend that a model grounded in honest-but-rational consensus participation provides a more accurate portrayal of behavior in economically incentivized systems such as blockchain protocols.",
11
+ "We empirically investigate timing games on the Ethereum network and demonstrate that while timing games are worth playing, they are not currently being exploited by consensus participants.",
12
+ "By quantifying the marginal value of time, we uncover strong evidence pointing towards their future potential, despite the limited exploitation of MEV capture observed at present."
13
+ ]
14
+ },
15
+ {
16
+ "title": "A Gold Standard Dataset for the Reviewer Assignment Problem",
17
+ "abstract": [
18
+ "Many peer-review venues are either using or looking to use algorithms to assign submissions to reviewers.",
19
+ "The crux of such automated approaches is the notion of the\"similarity score\"--a numerical estimate of the expertise of a reviewer in reviewing a paper--and many algorithms have been proposed to compute these scores.",
20
+ "However, these algorithms have not been subjected to a principled comparison, making it difficult for stakeholders to choose the algorithm in an evidence-based manner.",
21
+ "The key challenge in comparing existing algorithms and developing better algorithms is the lack of the publicly available gold-standard data that would be needed to perform reproducible research.",
22
+ "We address this challenge by collecting a novel dataset of similarity scores that we release to the research community.",
23
+ "Our dataset consists of 477 self-reported expertise scores provided by 58 researchers who evaluated their expertise in reviewing papers they have read previously.",
24
+ "We use this data to compare several popular algorithms employed in computer science conferences and come up with recommendations for stakeholders.",
25
+ "Our main findings are as follows.",
26
+ "First, all algorithms make a non-trivial amount of error.",
27
+ "For the task of ordering two papers in terms of their relevance for a reviewer, the error rates range from 12%-30% in easy cases to 36%-43% in hard cases, highlighting the vital need for more research on the similarity-computation problem.",
28
+ "Second, most existing algorithms are designed to work with titles and abstracts of papers, and in this regime the Specter+MFR algorithm performs best.",
29
+ "Third, to improve performance, it may be important to develop modern deep-learning based algorithms that can make use of the full texts of papers: the classical TD-IDF algorithm enhanced with full texts of papers is on par with the deep-learning based Specter+MFR that cannot make use of this information."
30
+ ]
31
+ },
32
+ {
33
+ "title": "ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing",
34
+ "abstract": [
35
+ "Given the rapid ascent of large language models (LLMs), we study the question: (How) can large language models help in reviewing of scientific papers or proposals?",
36
+ "We first conduct some pilot studies where we find that (i) GPT-4 outperforms other LLMs (Bard, Vicuna, Koala, Alpaca, LLaMa, Dolly, OpenAssistant, StableLM), and (ii) prompting with a specific question (e.g., to identify errors) outperforms prompting to simply write a review.",
37
+ "With these insights, we study the use of LLMs (specifically, GPT-4) for three tasks: 1.",
38
+ "Identifying errors: We construct 13 short computer science papers each with a deliberately inserted error, and ask the LLM to check for the correctness of these papers.",
39
+ "We observe that the LLM finds errors in 7 of them, spanning both mathematical and conceptual errors.",
40
+ "2.",
41
+ "Verifying checklists: We task the LLM to verify 16 closed-ended checklist questions in the respective sections of 15 NeurIPS 2022 papers.",
42
+ "We find that across 119 {checklist question, paper} pairs, the LLM had an 86.6% accuracy.",
43
+ "3.",
44
+ "Choosing the\"better\"paper: We generate 10 pairs of abstracts, deliberately designing each pair in such a way that one abstract was clearly superior than the other.",
45
+ "The LLM, however, struggled to discern these relatively straightforward distinctions accurately, committing errors in its evaluations for 6 out of the 10 pairs.",
46
+ "Based on these experiments, we think that LLMs have a promising use as reviewing assistants for specific reviewing tasks, but not (yet) for complete evaluations of papers or proposals."
47
+ ]
48
+ },
49
+ {
50
+ "title": "Assisting Human Decisions in Document Matching",
51
+ "abstract": [
52
+ "Many practical applications, ranging from paper-reviewer assignment in peer review to job-applicant matching for hiring, require human decision makers to identify relevant matches by combining their expertise with predictions from machine learning models.",
53
+ "In many such model-assisted document matching tasks, the decision makers have stressed the need for assistive information about the model outputs (or the data) to facilitate their decisions.",
54
+ "In this paper, we devise a proxy matching task that allows us to evaluate which kinds of assistive information improve decision makers' performance (in terms of accuracy and time).",
55
+ "Through a crowdsourced (N=271 participants) study, we find that providing black-box model explanations reduces users' accuracy on the matching task, contrary to the commonly-held belief that they can be helpful by allowing better understanding of the model.",
56
+ "On the other hand, custom methods that are designed to closely attend to some task-specific desiderata are found to be effective in improving user performance.",
57
+ "Surprisingly, we also find that the users' perceived utility of assistive information is misaligned with their objective utility (measured through their task performance)."
58
+ ]
59
+ },
60
+ {
61
+ "title": "Counterfactual Evaluation of Peer-Review Assignment Policies",
62
+ "abstract": [
63
+ "Peer review assignment algorithms aim to match research papers to suitable expert reviewers, working to maximize the quality of the resulting reviews.",
64
+ "A key challenge in designing effective assignment policies is evaluating how changes to the assignment algorithm map to changes in review quality.",
65
+ "In this work, we leverage recently proposed policies that introduce randomness in peer-review assignment--in order to mitigate fraud--as a valuable opportunity to evaluate counterfactual assignment policies.",
66
+ "Specifically, we exploit how such randomized assignments provide a positive probability of observing the reviews of many assignment policies of interest.",
67
+ "To address challenges in applying standard off-policy evaluation methods, such as violations of positivity, we introduce novel methods for partial identification based on monotonicity and Lipschitz smoothness assumptions for the mapping between reviewer-paper covariates and outcomes.",
68
+ "We apply our methods to peer-review data from two computer science venues: the TPDP'21 workshop (95 papers and 35 reviewers) and the AAAI'22 conference (8,450 papers and 3,145 reviewers).",
69
+ "We consider estimates of (i) the effect on review quality when changing weights in the assignment algorithm, e.g., weighting reviewers' bids vs. textual similarity (between the review's past papers and the submission), and (ii) the\"cost of randomization\", capturing the difference in expected quality between the perturbed and unperturbed optimal match.",
70
+ "We find that placing higher weight on text similarity results in higher review quality and that introducing randomization in the reviewer-paper assignment only marginally reduces the review quality.",
71
+ "Our methods for partial identification may be of independent interest, while our off-policy approach can likely find use evaluating a broad class of algorithmic matching systems."
72
+ ]
73
+ },
74
+ {
75
+ "title": "Testing for Reviewer Anchoring in Peer Review: A Randomized Controlled Trial",
76
+ "abstract": [
77
+ "Peer review frequently follows a process where reviewers first provide initial reviews, authors respond to these reviews, then reviewers update their reviews based on the authors' response.",
78
+ "There is mixed evidence regarding whether this process is useful, including frequent anecdotal complaints that reviewers insufficiently update their scores.",
79
+ "In this study, we aim to investigate whether reviewers anchor to their original scores when updating their reviews, which serves as a potential explanation for the lack of updates in reviewer scores.",
80
+ "We design a novel randomized controlled trial to test if reviewers exhibit anchoring.",
81
+ "In the experimental condition, participants initially see a flawed version of a paper that is later corrected, while in the control condition, participants only see the correct version.",
82
+ "We take various measures to ensure that in the absence of anchoring, reviewers in the experimental group should revise their scores to be identically distributed to the scores from the control group.",
83
+ "Furthermore, we construct the reviewed paper to maximize the difference between the flawed and corrected versions, and employ deception to hide the true experiment purpose.",
84
+ "Our randomized controlled trial consists of 108 researchers as participants.",
85
+ "First, we find that our intervention was successful at creating a difference in perceived paper quality between the flawed and corrected versions: Using a permutation test with the Mann-Whitney U statistic, we find that the experimental group's initial scores are lower than the control group's scores in both the Evaluation category (Vargha-Delaney A=0.64, p=0.0096) and Overall score (A=0.59, p=0.058).",
86
+ "Next, we test for anchoring by comparing the experimental group's revised scores with the control group's scores.",
87
+ "We find no significant evidence of anchoring in either the Overall (A=0.50, p=0.61) or Evaluation category (A=0.49, p=0.61)."
88
+ ]
89
+ },
90
+ {
91
+ "title": "Batching of Tasks by Users of Pseudonymous Forums: Anonymity Compromise and Protection",
92
+ "abstract": [
93
+ "In a number of applications where anonymity is critical, users act under pseudonyms to preserve their privacy.",
94
+ "For instance, in scientific peer review using forums like OpenReview.net, reviewers make comments on papers that are publicly viewable.",
95
+ "Reviewers who have been assigned multiple papers operate under different pseudonyms across their papers to remain anonymous.",
96
+ "Other examples of publicly visible tasks where users operate under pseudonyms include Wikipedia editing and cryptocurrency transactions.",
97
+ "In these settings, it is common for users to engage in batching - the completion of several similar tasks at the same time.",
98
+ "Batching occurs both due to natural bursts in activity (e.g., a person visits a website and makes many comments at once) or as a productivity strategy used to streamline work.",
99
+ "In peer-review forums such as computer science conferences, reviewers and meta-reviewers are often assigned multiple papers.",
100
+ "We find empirically that reviewers are highly likely to batch their comments and/or reviews across papers.",
101
+ "In analysis of data from a top Computer Science conference with thousands of papers, reviewers, and discussion comments we find that when reviewers and meta-reviewers comment on multiple papers, they have a 30.10% chance of batching their comments within 5 minutes of one other.",
102
+ "In comparison, any randomly chosen pair of reviewers and meta- reviewers had only a 0.66% chance of making comments on different papers within 5 minutes of each other."
103
+ ]
104
+ },
105
+ {
106
+ "title": "To ArXiv or not to ArXiv: A Study Quantifying Pros and Cons of Posting Preprints Online",
107
+ "abstract": [
108
+ "Double-blind conferences have engaged in debates over whether to allow authors to post their papers online on arXiv or elsewhere during the review process.",
109
+ "Independently, some authors of research papers face the dilemma of whether to put their papers on arXiv due to its pros and cons.",
110
+ "We conduct a study to substantiate this debate and dilemma via quantitative measurements.",
111
+ "Specifically, we conducted surveys of reviewers in two top-tier double-blind computer science conferences -- ICML 2021 (5361 submissions and 4699 reviewers) and EC 2021 (498 submissions and 190 reviewers).",
112
+ "Our two main findings are as follows.",
113
+ "First, more than a third of the reviewers self-report searching online for a paper they are assigned to review.",
114
+ "Second, outside the review process, we find that preprints from better-ranked affiliations see a weakly higher visibility, with a correlation of 0.06 in ICML and 0.05 in EC.",
115
+ "In particular, papers associated with the top-10-ranked affiliations had a visibility of approximately 11% in ICML and 22% in EC, whereas the remaining papers had a visibility of 7% and 18% respectively."
116
+ ]
117
+ },
118
+ {
119
+ "title": "T RADEOFFS IN P REVENTING M ANIPULATION IN P APER B IDDING FOR R EVIEWER A SSIGNMENT",
120
+ "abstract": [
121
+ "Many conferences rely on paper bidding as a key component of their reviewer assignment procedure.",
122
+ "These bids are then taken into account when assigning reviewers to help ensure that each reviewer is assigned to suitable papers.",
123
+ "However, despite the benefits of using bids, reliance on paper bidding can allow malicious reviewers to manipulate the paper assignment for unethical purposes (e.g., getting assigned to a friend\u2019s paper).",
124
+ "Several different approaches to preventing this manipulation have been proposed and deployed.",
125
+ "In this paper, we enumerate certain desirable properties that algorithms for addressing bid manipulation should satisfy.",
126
+ "We then offer a high-level analysis of various approaches along with directions for future investigation."
127
+ ]
128
+ },
129
+ {
130
+ "title": "Strategyproofing Peer Assessment via Partitioning: The Price in Terms of Evaluators' Expertise",
131
+ "abstract": [
132
+ "Strategic behavior is a fundamental problem in a variety of real-world applications that require some form of peer assessment, such as peer grading of homeworks, grant proposal review, conference peer review of scientific papers, and peer assessment of employees in organizations.",
133
+ "Since an individual's own work is in competition with the submissions they are evaluating, they may provide dishonest evaluations to increase the relative standing of their own submission.",
134
+ "This issue is typically addressed by partitioning the individuals and assigning them to evaluate the work of only those from different subsets.",
135
+ "Although this method ensures strategyproofness, each submission may require a different type of expertise for effective evaluation.",
136
+ "In this paper, we focus on finding an assignment of evaluators to submissions that maximizes assigned evaluators' expertise subject to the constraint of strategyproofness.",
137
+ "We analyze the price of strategyproofness: that is, the amount of compromise on the assigned evaluators' expertise required in order to get strategyproofness.",
138
+ "We establish several polynomial-time algorithms for strategyproof assignment along with assignment-quality guarantees.",
139
+ "Finally, we evaluate the methods on a dataset from conference peer review."
140
+ ]
141
+ },
142
+ {
143
+ "title": "Allocation Schemes in Analytic Evaluation: Applicant-Centric Holistic or Attribute-Centric Segmented?",
144
+ "abstract": [
145
+ "Many applications such as hiring and university admissions involve evaluation and selection of applicants.",
146
+ "These tasks are fundamentally difficult, and require combining evidence from multiple different aspects (what we term \"attributes\").",
147
+ "In these applications, the number of applicants is often large, and a common practice is to assign the task to multiple evaluators in a distributed fashion.",
148
+ "Specifically, in the often-used holistic allocation, each evaluator is assigned a subset of the applicants, and is asked to assess all relevant information for their assigned applicants.",
149
+ "However, such an evaluation process is subject to issues such as miscalibration (evaluators see only a small fraction of the applicants and may not get a good sense of relative quality), and discrimination (evaluators are influenced by irrelevant information about the applicants).",
150
+ "We identify that such attribute-based evaluation allows alternative allocation schemes.",
151
+ "Specifically, we consider assigning each evaluator more applicants but fewer attributes per applicant, termed segmented allocation.",
152
+ "We compare segmented allocation to holistic allocation on several dimensions via theoretical and experimental methods.",
153
+ "We establish various tradeoffs between these two approaches, and identify conditions under which one approach results in more accurate evaluation than the other."
154
+ ]
155
+ },
156
+ {
157
+ "title": "Integrating Rankings into Quantized Scores in Peer Review",
158
+ "abstract": [
159
+ "In peer review, reviewers are usually asked to provide scores for the papers.",
160
+ "The scores are then used by Area Chairs or Program Chairs in various ways in the decision-making process.",
161
+ "The scores are usually elicited in a quantized form to accommodate the limited cognitive ability of humans to describe their opinions in numerical values.",
162
+ "It has been found that the quantized scores suffer from a large number of ties, thereby leading to a significant loss of information.",
163
+ "To mitigate this issue, conferences have started to ask reviewers to additionally provide a ranking of the papers they have reviewed.",
164
+ "There are however two key challenges.",
165
+ "First, there is no standard procedure for using this ranking information and Area Chairs may use it in different ways (including simply ignoring them), thereby leading to arbitrariness in the peer-review process.",
166
+ "Second, there are no suitable interfaces for judicious use of this data nor methods to incorporate it in existing workflows, thereby leading to inefficiencies.",
167
+ "We take a principled approach to integrate the ranking information into the scores.",
168
+ "The output of our method is an updated score pertaining to each review that also incorporates the rankings.",
169
+ "Our approach addresses the two aforementioned challenges by: (i) ensuring that rankings are incorporated into the updates scores in the same manner for all papers, thereby mitigating arbitrariness, and (ii) allowing to seamlessly use existing interfaces and workflows designed for scores.",
170
+ "We empirically evaluate our method on synthetic datasets as well as on peer reviews from the ICLR 2017 conference, and find that it reduces the error by approximately 30% as compared to the best performing baseline on the ICLR 2017 data."
171
+ ]
172
+ },
173
+ {
174
+ "title": "No Rose for MLE: Inadmissibility of MLE for Evaluation Aggregation Under Levels of Expertise",
175
+ "abstract": [
176
+ "A number of applications including crowd-sourced labeling and peer review require aggregation of labels or evaluations sourced from multiple evaluators.",
177
+ "There is often additional information available pertaining to the evaluators\u2019 expertise.",
178
+ "A natural approach for aggregation is to consider the widely studied Dawid-Skene model (or its extensions incorporating evaluators\u2019 expertise), and employ the standard maximum likelihood estimator (MLE).",
179
+ "While MLE is in general widely used in practice and enjoys a number of appealing theoretical guarantees, in this work we provide a negative result for the MLE.",
180
+ "Specifically, we prove that the MLE is asymptotically inadmissible for a special case of evaluation aggregation with expertise level information.",
181
+ "We show this by constructing an alternative estimator that we show is significantly better than the MLE in certain parameter regimes and at least as good elsewhere.",
182
+ "Finally, simulations reveal that our findings may hold in more general conditions than what we theoretically analyze."
183
+ ]
184
+ },
185
+ {
186
+ "title": "Addendum and Erratum to \u201cThe MDS Queue: Analysing the Latency Performance of Erasure Codes\u201d",
187
+ "abstract": [
188
+ "In the above article [1], we introduced two scheduling policies and analyzed their average job latencies.",
189
+ "With an implicit assumption that the scheduling policies provide sample-path bounds by construction, we claimed that their average job latencies serve as upper and lower bounds on that of a centralized MDS queue.",
190
+ "In this note, we present recently discovered counterexamples, disproving the assumption.",
191
+ "We replace the assumption with a conjecture that the average latency bounds still hold.",
192
+ "We also provide an erratum to the original article to correct any confusing or misleading statements."
193
+ ]
194
+ },
195
+ {
196
+ "title": "Cite-seeing and reviewing: A study on citation bias in peer review",
197
+ "abstract": [
198
+ "Citations play an important role in researchers\u2019 careers as a key factor in evaluation of scientific impact.",
199
+ "Many anecdotes advice authors to exploit this fact and cite prospective reviewers to try obtaining a more positive evaluation for their submission.",
200
+ "In this work, we investigate if such a citation bias actually exists: Does the citation of a reviewer\u2019s own work in a submission cause them to be positively biased towards the submission?",
201
+ "In conjunction with the review process of two flagship conferences in machine learning and algorithmic economics, we execute an observational study to test for citation bias in peer review.",
202
+ "In our analysis, we carefully account for various confounding factors such as paper quality and reviewer expertise, and apply different modeling techniques to alleviate concerns regarding the model mismatch.",
203
+ "Overall, our analysis involves 1,314 papers and 1,717 reviewers and detects citation bias in both venues we consider.",
204
+ "In terms of the effect size, by citing a reviewer\u2019s work, a submission has a non-trivial chance of getting a higher score from the reviewer: an expected increase in the score is approximately 0.23 on a 5-point Likert item.",
205
+ "For reference, a one-point increase of a score by a single reviewer improves the position of a submission by 11% on average."
206
+ ]
207
+ },
208
+ {
209
+ "title": "Tradeoffs in Preventing Manipulation in Paper Bidding for Reviewer Assignment",
210
+ "abstract": [
211
+ "Many conferences rely on paper bidding as a key component of their reviewer assignment procedure.",
212
+ "These bids are then taken into account when assigning reviewers to help ensure that each reviewer is assigned to suitable papers.",
213
+ "However, despite the benefits of using bids, reliance on paper bidding can allow malicious reviewers to manipulate the paper assignment for unethical purposes (e.g., getting assigned to a friend's paper).",
214
+ "Several different approaches to preventing this manipulation have been proposed and deployed.",
215
+ "In this paper, we enumerate certain desirable properties that algorithms for addressing bid manipulation should satisfy.",
216
+ "We then offer a high-level analysis of various approaches along with directions for future investigation."
217
+ ]
218
+ },
219
+ {
220
+ "title": "The Price of Strategyproofing Peer Assessment",
221
+ "abstract": [
222
+ "Strategic behavior is a fundamental problem in a variety of real-world applications that require some form of peer assessment, such as peer grading of assignments, grant proposal review, conference peer review, and peer assessment of employees.",
223
+ "Since an individual\u2019s own work is in competition with the submissions they are evaluating, they may provide dishonest evaluations to increase the relative standing of their own submission.",
224
+ "This issue is typically addressed by partitioning the individuals and assigning them to evaluate the work of only those from different subsets.",
225
+ "Although this method ensures strategyproofness, each submission may require a different type of expertise for effective evaluation.",
226
+ "In this paper, we focus on finding an assignment of evaluators to submissions that maximizes assigned expertise subject to the constraint of strategyproofness.",
227
+ "We analyze the price of strategyproofness: that is, the amount of compromise on the assignment quality required in order to get strategyproofness.",
228
+ "We establish several polynomial-time algorithms for strategyproof assignment along with assignment-quality guarantees.",
229
+ "Finally, we evaluate the methods on a dataset from conference peer review."
230
+ ]
231
+ },
232
+ {
233
+ "title": "The role of author identities in peer review",
234
+ "abstract": [
235
+ "There is widespread debate on whether to anonymize author identities in peer review.",
236
+ "The key argument for anonymization is to mitigate bias, whereas arguments against anonymization posit various uses of author identities in the review process.",
237
+ "The Innovations in Theoretical Computer Science (ITCS) 2023 conference adopted a middle ground by initially anonymizing the author identities from reviewers, revealing them after the reviewer had submitted their initial reviews, and allowing the reviewer to change their review subsequently.",
238
+ "We present an analysis of the reviews pertaining to the identification and use of author identities.",
239
+ "Our key findings are: (I) A majority of reviewers self-report not knowing and being unable to guess the authors\u2019 identities for the papers they were reviewing. (",
240
+ "II) After the initial submission of reviews, 7.1% of reviews changed their overall merit score and 3.8% changed their self-reported reviewer expertise. (",
241
+ "III) There is a very weak and statistically insignificant correlation of the rank of authors\u2019 affiliations with the change in overall merit; there is a weak but statistically significant correlation with respect to change in reviewer expertise.",
242
+ "We also conducted an anonymous survey to obtain opinions from reviewers and authors.",
243
+ "The main findings from the 200 survey responses are: (i) A vast majority of participants favor anonymizing author identities in some form. (",
244
+ "ii) The \u201cmiddle-ground\u201d initiative of ITCS 2023 was appreciated. (",
245
+ "iii) Detecting conflicts of interest is a challenge that needs to be addressed if author identities are anonymized.",
246
+ "Overall, these findings support anonymization of author identities in some form (e.g., as was done in ITCS 2023), as long as there is a robust and efficient way to check conflicts of interest."
247
+ ]
248
+ },
249
+ {
250
+ "title": "Calibration with Privacy in Peer Review",
251
+ "abstract": [
252
+ "This paper is eligible for the Jack Keil Wolf ISIT Student Paper Award.",
253
+ "Reviewers in peer review are often miscalibrated: they may be strict, lenient, extreme, moderate, etc.",
254
+ "A number of algorithms have previously been proposed to calibrate reviews.",
255
+ "Such attempts of calibration can however leak sensitive information about which reviewer reviewed which paper.",
256
+ "In this paper, we identify this problem of calibration with privacy, and provide a foundational building block to address it.",
257
+ "Specifically, we present a theoretical study of this problem under a simplified-yet-challenging model involving two reviewers, two papers, and an MAP-computing adversary.",
258
+ "Our main results establish the Pareto frontier of the tradeoff between privacy (preventing the adversary from inferring reviewer identity) and utility (accepting better papers), and design explicit computationally-efficient algorithms that we prove are Pareto optimal."
259
+ ]
+ },
+ {
+ "title": "Batching of Tasks by Users of Pseudonymous Forums: Anonymity Compromise and Protection",
+ "abstract": [
+ "There are a number of forums where people participate under pseudonyms.",
+ "One example is peer review, where the identity of reviewers for any paper is confidential.",
+ "When participating in these forums, people frequently engage in \"batching\": executing multiple related tasks (e.g., commenting on multiple papers) at nearly the same time.",
+ "Our empirical analysis shows that batching is common in two applications we consider -- peer review and Wikipedia edits.",
+ "In this paper, we identify and address the risk of deanonymization arising from linking batched tasks.",
+ "To protect against linkage attacks, we take the approach of adding delay to the posting time of batched tasks.",
+ "We first show that under some natural assumptions, no delay mechanism can provide a meaningful differential privacy guarantee.",
+ "We therefore propose a \"one-sided\" formulation of differential privacy for protecting against linkage attacks.",
+ "We design a mechanism that adds zero-inflated uniform delay to events and show it can preserve privacy.",
+ "We prove that this noise distribution is in fact optimal in minimizing expected delay among mechanisms adding independent noise to each event, thereby establishing the Pareto frontier of the trade-off between the expected delay for batched and unbatched events.",
+ "Finally, we conduct a series of experiments on Wikipedia and Bitcoin data that corroborate the practical utility of our algorithm in obfuscating batching without introducing onerous delay to a system."
+ ]
+ }
+ ],
+ "user_kps": [
+ "algorithmic mechanisms",
+ "anonymization",
+ "assignment algorithms",
+ "bidding mechanism",
+ "bounded latencies",
+ "candidate selection",
+ "citations",
+ "decentralized mechanisms",
+ "expert curation",
+ "highest-scoring documents",
+ "less-than-expert labeling",
+ "malicious manipulations",
+ "peer selection",
+ "private information retrieval",
+ "pseudonymity",
+ "ranking mechanisms",
+ "review selection",
+ "reviewers",
+ "textual reviews",
+ "user anonymity"
+ ]
+ }