You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: arxiv.json
+42Lines changed: 42 additions & 0 deletions
Original file line number
Diff line number
Diff line change
@@ -37014,5 +37014,47 @@
37014
37014
"pub_date": "2024-12-16",
37015
37015
"summary": "Session-based recommendation seeks to forecast the next item a user will be interested in, based on their interaction sequences. Due to limited interaction data, session-based recommendation faces the challenge of limited data availability. Traditional methods enhance feature learning by constructing complex models to generate positive and negative samples. This paper proposes a session-based recommendation model using Single Positive optimization loss and Graph Learning (SPGL) to deal with the problem of data sparsity, high model complexity and weak transferability. SPGL utilizes graph convolutional networks to generate global item representations and batch session representations, effectively capturing intrinsic relationships between items. The use of single positive optimization loss improves uniformity of item representations, thereby enhancing recommendation accuracy. In the intent extractor, SPGL considers the hop count of the adjacency matrix when constructing the directed global graph to fully integrate spatial information. It also takes into account the reverse positional information of items when constructing session representations to incorporate temporal information. Comparative experiments across three benchmark datasets, Tmall, RetailRocket and Diginetica, demonstrate the model's effectiveness. The source code can be accessed on https://github.com/liang-tian-tian/SPGL .",
"title": "SepLLM: Accelerate Large Language Models by Compressing One Segment into\n One Separator",
37020
+
"url": "http://arxiv.org/abs/2412.12094v1",
37021
+
"pub_date": "2024-12-16",
37022
+
"summary": "Large Language Models (LLMs) have exhibited exceptional performance across a spectrum of natural language processing tasks. However, their substantial sizes pose considerable challenges, particularly in computational demands and inference speed, due to their quadratic complexity. In this work, we have identified a key pattern: certain seemingly meaningless special tokens (i.e., separators) contribute disproportionately to attention scores compared to semantically meaningful tokens. This observation suggests that information of the segments between these separator tokens can be effectively condensed into the separator tokens themselves without significant information loss. Guided by this insight, we introduce SepLLM, a plug-and-play framework that accelerates inference by compressing these segments and eliminating redundant tokens. Additionally, we implement efficient kernels for training acceleration. Experimental results across training-free, training-from-scratch, and post-training settings demonstrate SepLLM's effectiveness. Notably, using the Llama-3-8B backbone, SepLLM achieves over 50% reduction in KV cache on the GSM8K-CoT benchmark while maintaining comparable performance. Furthermore, in streaming settings, SepLLM effectively processes sequences of up to 4 million tokens or more while maintaining consistent language modeling capabilities.",
"title": "Making FETCH! Happen: Finding Emergent Dog Whistles Through Common\n Habitats",
37027
+
"url": "http://arxiv.org/abs/2412.12072v1",
37028
+
"pub_date": "2024-12-16",
37029
+
"summary": "WARNING: This paper contains content that maybe upsetting or offensive to some readers. Dog whistles are coded expressions with dual meanings: one intended for the general public (outgroup) and another that conveys a specific message to an intended audience (ingroup). Often, these expressions are used to convey controversial political opinions while maintaining plausible deniability and slip by content moderation filters. Identification of dog whistles relies on curated lexicons, which have trouble keeping up to date. We introduce \\textbf{FETCH!}, a task for finding novel dog whistles in massive social media corpora. We find that state-of-the-art systems fail to achieve meaningful results across three distinct social media case studies. We present \\textbf{EarShot}, a novel system that combines the strengths of vector databases and Large Language Models (LLMs) to efficiently and effectively identify new dog whistles.",
"title": "Semi-automated analysis of audio-recorded lessons: The case of teachers'\n engaging messages",
37034
+
"url": "http://arxiv.org/abs/2412.12062v1",
37035
+
"pub_date": "2024-12-16",
37036
+
"summary": "Engaging messages delivered by teachers are a key aspect of the classroom discourse that influences student outcomes. However, improving this communication is challenging due to difficulties in obtaining observations. This study presents a methodology for efficiently extracting actual observations of engaging messages from audio-recorded lessons. We collected 2,477 audio-recorded lessons from 75 teachers over two academic years. Using automatic transcription and keyword-based filtering analysis, we identified and classified engaging messages. This method reduced the information to be analysed by 90%, optimising the time and resources required compared to traditional manual coding. Subsequent descriptive analysis revealed that the most used messages emphasised the future benefits of participating in school activities. In addition, the use of engaging messages decreased as the academic year progressed. This study offers insights for researchers seeking to extract information from teachers' discourse in naturalistic settings and provides useful information for designing interventions to improve teachers' communication strategies.",
"title": "Virtual Agent-Based Communication Skills Training to Facilitate Health\n Persuasion Among Peers",
37041
+
"url": "http://arxiv.org/abs/2412.12061v1",
37042
+
"pub_date": "2024-12-16",
37043
+
"summary": "Many laypeople are motivated to improve the health behavior of their family or friends but do not know where to start, especially if the health behavior is potentially stigmatizing or controversial. We present an approach that uses virtual agents to coach community-based volunteers in health counseling techniques, such as motivational interviewing, and allows them to practice these skills in role-playing scenarios. We use this approach in a virtual agent-based system to increase COVID-19 vaccination by empowering users to influence their social network. In a between-subjects comparative design study, we test the effects of agent system interactivity and role-playing functionality on counseling outcomes, with participants evaluated by standardized patients and objective judges. We find that all versions are effective at producing peer counselors who score adequately on a standardized measure of counseling competence, and that participants were significantly more satisfied with interactive virtual agents compared to passive viewing of the training material. We discuss design implications for interpersonal skills training systems based on our findings.",
"title": "How Private are Language Models in Abstractive Summarization?",
37048
+
"url": "http://arxiv.org/abs/2412.12040v1",
37049
+
"pub_date": "2024-12-16",
37050
+
"summary": "Language models (LMs) have shown outstanding performance in text summarization including sensitive domains such as medicine and law. In these settings, it is important that personally identifying information (PII) included in the source document should not leak in the summary. Prior efforts have mostly focused on studying how LMs may inadvertently elicit PII from training data. However, to what extent LMs can provide privacy-preserving summaries given a non-private source document remains under-explored. In this paper, we perform a comprehensive study across two closed- and three open-weight LMs of different sizes and families. We experiment with prompting and fine-tuning strategies for privacy-preservation across a range of summarization datasets across three domains. Our extensive quantitative and qualitative analysis including human evaluation shows that LMs often cannot prevent PII leakage on their summaries and that current widely-used metrics cannot capture context dependent privacy risks.",
"title": "Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability\n Detection",
37055
+
"url": "http://arxiv.org/abs/2412.12039v1",
37056
+
"pub_date": "2024-12-16",
37057
+
"summary": "Despite their remarkable success, large language models (LLMs) have shown limited ability on applied tasks such as vulnerability detection. We investigate various prompting strategies for vulnerability detection and, as part of this exploration, propose a prompting strategy that integrates natural language descriptions of vulnerabilities with a contrastive chain-of-thought reasoning approach, augmented using contrastive samples from a synthetic dataset. Our study highlights the potential of LLMs to detect vulnerabilities by integrating natural language descriptions, contrastive reasoning, and synthetic examples into a comprehensive prompting framework. Our results show that this approach can enhance LLM understanding of vulnerabilities. On a high-quality vulnerability detection dataset such as SVEN, our prompting strategies can improve accuracies, F1-scores, and pairwise accuracies by 23%, 11%, and 14%, respectively.",
0 commit comments