Commit 4f839dc

update publications and info

Parent: 20aebca
11 files changed: 30 additions & 18 deletions

_bibliography/papers.bib

Lines changed: 14 additions & 2 deletions
@@ -20,15 +20,27 @@ @article{keyu2025explanations
   journal={ACL Findings},
   abstract={Language models today are widely used in education, yet their ability to tailor responses for learners with varied informational needs and knowledge backgrounds remains under-explored. To this end, we introduce ELI-WHY, a benchmark of 13.4K "Why" questions to assess the pedagogical capabilities of LLMs. We then conduct two extensive human studies to assess the utility of LLM-generated explanatory answers (explanations) on our benchmark, tailored to three distinct educational grades: elementary, high-school, and graduate school. In our first study, human raters assume the role of an "educator" to assess model explanations' fit to different educational grades. We find that GPT-4-generated explanations match their intended educational background only 50% of the time, compared to 79% for human-curated explanations. In our second study, human raters assume the role of a learner to assess if an explanation fits their own informational needs. Results show that users deemed GPT-4-generated explanations relatively 20% less suited to their informational needs, particularly for advanced learners. Additionally, automated evaluation metrics reveal that GPT-4 explanations for different informational needs remain indistinguishable in their grade-level, limiting their pedagogical effectiveness. These findings suggest that LLMs' ability to follow inference-time instructions alone is insufficient for producing high-utility explanations tailored to users' informational needs.},
   selected={true},
+  doi={10.48550/arXiv.2506.14200},
   pdf={ELI_Why_Evaluating_the_Pe.pdf}
 }
 
 @article{keyu2025vlm,
-  title={Beyond the Text: How Explanation Qualities Influence User Trust in Visual Language Models},
+  title={Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations},
   author={Keyu He and Tejas Srinivasan and Brihi Joshi and Xiang Ren and Jesse Thomason and Swabha Swayamdipta},
   year={2025},
   journal={Submitted to NeurIPS, Under Review},
   abstract={When people query Vision-Language Models (VLMs) but cannot see the accompanying visual context (e.g. for blind and low-vision users), augmenting VLM predictions with natural language explanations can signal which model predictions are reliable. However, prior work has found that explanations can easily convince users that inaccurate VLM predictions are correct. To remedy undesirable overreliance on VLM predictions, we propose evaluating two complementary qualities of VLM-generated explanations via two quality scoring functions. We propose Visual Fidelity, which captures how faithful an explanation is to the visual context, and Contrastiveness, which captures how well the explanation identifies visual details that distinguish the model's prediction from plausible alternatives. On the A-OKVQA and VizWiz tasks, these quality scoring functions are better calibrated with model correctness than existing explanation qualities. We conduct a user study in which participants have to decide whether a VLM prediction is accurate without viewing its visual context. We observe that showing our quality scores alongside VLM explanations improves participants' accuracy at predicting VLM correctness by 11.1%, including a 15.4% reduction in the rate of falsely believing incorrect predictions. These findings highlight the utility of explanation quality scores in fostering appropriate reliance on VLM predictions.},
-  selected={true}
+  selected={true},
+  pdf={VLM_Rationales.pdf}
 }
 
+@article{enhancing-debugging,
+  title = {Enhancing Debugging Skills of LLMs with Prompt Engineering},
+  author = {Keyu He* and Max Li* and Joseph Liu*},
+  abstract = {This paper presents a comprehensive study on improving the debugging capabilities of Large Language Models (LLMs) like GPT-3.5, focusing on the application of prompt engineering techniques. We explore the efficacy of few-shot learning, chain-of-thought prompting, and a baseline zero-shot model in enhancing LLMs' ability to debug code. Utilizing static and dynamic evaluation metrics, the study rigorously assesses the debugging proficiency of these models. By introducing different types of bugs, including procedural and language model-generated errors, and applying varied prompting strategies, we provide a deeper understanding of LLMs' debugging capabilities. The results provide insights into the limitations of the debugging capabilities of GPT-3.5 Turbo, even with the assistance of various prompting techniques. Source code for our evaluation method and bug-generation techniques is in the GitHub repository: https://github.com/FrankHe2002/CSCI499FinalProject.},
+  journal = {Tech Report},
+  selected = {false},
+  year = {2023},
+  month = dec,
+  pdf = {Enhancing-Debugging.pdf}
+}
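The new `enhancing-debugging` entry compares three prompting strategies: a zero-shot baseline, few-shot learning, and chain-of-thought prompting. As an illustration only, here is a minimal sketch of how such prompts differ; the function name, prompt wording, and worked example are hypothetical and not taken from the paper's repository.

```python
# Illustrative sketch of the three prompting strategies compared in the
# "Enhancing Debugging Skills of LLMs with Prompt Engineering" report.
# All names and wording here are hypothetical.

BUGGY_CODE = '''def mean(xs):
    return sum(xs) / len(xs) - 1  # bug: spurious "- 1"
'''

def build_prompt(strategy: str, code: str) -> str:
    """Assemble a debugging prompt for a chat LLM under one of three strategies."""
    if strategy == "zero-shot":
        # Baseline: only the task instruction and the buggy code.
        return f"Find and fix the bug in this Python function:\n{code}"
    if strategy == "few-shot":
        # Prepend a worked (buggy -> fixed) example before the real task.
        example = (
            "Example:\n"
            "Buggy: def inc(x): return x - 1\n"
            "Fixed: def inc(x): return x + 1\n\n"
        )
        return example + f"Now fix the bug in this Python function:\n{code}"
    if strategy == "chain-of-thought":
        # Ask the model to reason step by step before giving a fix.
        return (
            f"Find and fix the bug in this Python function:\n{code}\n"
            "Explain your reasoning step by step, then give the corrected code."
        )
    raise ValueError(f"unknown strategy: {strategy}")

for s in ("zero-shot", "few-shot", "chain-of-thought"):
    print(f"--- {s} ---")
    print(build_prompt(s, BUGGY_CODE))
```

The strategies differ only in what surrounds the buggy code: nothing extra, worked examples, or an explicit request for intermediate reasoning.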

_config.yml

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@ footer_text: >
   Hosted by <a href="https://pages.github.com/" target="_blank">GitHub Pages</a>.
 keywords: jekyll, jekyll-theme, academic-website, portfolio-website # add your own keywords or leave empty
 lang: en # the language of your site (for example: en, fr, cn, ru, etc.)
-icon: prof_pic.jpg # the emoji used as the favicon (alternatively, provide image name in /assets/img/)
+icon: keyu.jpg # the emoji used as the favicon (alternatively, provide image name in /assets/img/)
 
 url: https://Keyu-He.github.io # the base hostname & protocol for your site
 baseurl: # the subpath of your site, e.g. /blog/. Leave blank for root

@@ -102,7 +102,7 @@ spotify_id: # your spotify id
 stackoverflow_id: # your stackoverflow id
 telegram_username: # your Telegram user name
 unsplash_id: # your unsplash id
-wechat_qr: wechat-qr.jpg # filename of your wechat qr-code saved as an image (e.g., wechat-qr.png if saved to assets/img/wechat-qr.png)
+wechat_qr: wechat_qr.jpg # filename of your wechat qr-code saved as an image (e.g., wechat_qr.png if saved to assets/img/wechat_qr.png)
 whatsapp_number: # your WhatsApp number (full phone number in international format. Omit any zeroes, brackets, or dashes when adding the phone number in international format.)
 wikidata_id: # your wikidata id
 wikipedia_id: # your wikipedia id (Case sensitive)

_pages/about.md

Lines changed: 8 additions & 8 deletions
@@ -6,28 +6,28 @@ permalink: /
 
 profile:
   align: right
-  image: prof_pic.jpg
+  image: keyu.jpg
   image_circular: true # crops the image to make it circular
-  # more_info: >
-  #   <p>555 your office number</p>
-  #   <p>123 your address street</p>
-  #   <p>Your City, State 12345</p>
+  more_info: >
+    <p>6404 Gates & Hillman Centers</p>
+    <p>4902 Forbes Ave</p>
+    <p>Pittsburgh, PA 15213</p>
 
 news: true # includes a list of news items
 selected_papers: true # includes a list of papers marked as "selected={true}"
 social: true # includes social icons at the bottom of the page
 ---
 
-Hi! I am Keyu, a senior undergraduate student double majoring in **Computer Science** and **Applied & Computational Mathematics** at the University of Southern California ([USC](https://www.usc.edu/)), with a minor in **Artificial Intelligence Applications**.
+Hi! I’m Keyu He, a Master of Intelligent Information Systems ([MIIS](https://miis.cs.cmu.edu/)) student at Carnegie Mellon University ([CMU](https://www.cmu.edu/)). Previously, I earned my B.S. from the University of Southern California ([USC](https://www.usc.edu/)), where I double-majored in **Computer Science** and **Applied & Computational Mathematics** and minored in **Artificial Intelligence Applications**.
 
-Passionate about Natural Language Processing (NLP), I’m fascinated by its potential to bridge communication gaps and foster understanding. I am currently working on NLP research as a member of the [INK Lab](https://inklab.usc.edu/) and [DILL Lab](https://dill-lab.github.io/), under the guidance of Professors [Xiang Ren](https://www.seanre.com/) and [Swabha Swayamdipta](https://swabhs.com/), and the mentorship of [Brihi Joshi](https://brihijoshi.github.io/). My work focuses on **explainability and building user trust** in AI systems.
+Passionate about Natural Language Processing ([NLP](https://en.wikipedia.org/wiki/Natural_language_processing)), I’m fascinated by its potential to bridge communication gaps and foster understanding. I worked on NLP research as a member of the [INK Lab](https://inklab.usc.edu/) and [DILL Lab](https://dill-lab.github.io/), under the guidance of Professors [Xiang Ren](https://www.seanre.com/) and [Swabha Swayamdipta](https://swabhs.com/), and the mentorship of [Brihi Joshi](https://brihijoshi.github.io/). My work focuses on **explainability and building user trust** in AI systems.
 
 ### Academic Achievements & Honors
 - **USC Center for Undergraduate Research in Viterbi Engineering (CURVE) fellow**.
   - Fellowship awarded multiple times (Spring 2024 -- Spring 2025), with a total funding amount of $6,750.
 - **USC Academic Achievement Award**: Fall 2022 - Fall 2024.
   - This award covered 11 units of tuition costs in total, amounting to approximately $24,000.
-- **USC Dornsife Dean’s List & USC Viterbi Dean's List**: 2021-2024.
+- **USC Dornsife Dean’s List & USC Viterbi Dean's List**: 2021-2025.
 - **4th Place, USC Integral Bee Competition**: 2022.
 
 ### Current Research

_pages/profiles.md

Lines changed: 2 additions & 2 deletions
@@ -10,15 +10,15 @@ nav_order: 7
 # # if you want to include more than one profile, just replicate the following block
 # # and create one content file for each profile inside _pages/
 # - align: right
-#   image: prof_pic.jpg
+#   image: keyu.jpg
 #   content: about_einstein.md
 #   image_circular: false # crops the image to make it circular
 #   more_info: >
 #     <p>555 your office number</p>
 #     <p>123 your address street</p>
 #     <p>Your City, State 12345</p>
 # - align: left
-#   image: prof_pic.jpg
+#   image: keyu.jpg
 #   content: about_einstein.md
 #   image_circular: false # crops the image to make it circular
 #   more_info: >

_posts/2024-01-27-advanced-images.md

Lines changed: 2 additions & 2 deletions
@@ -30,6 +30,6 @@ This is a simple image slider. It uses the [Swiper](https://swiperjs.com/) libra
 This is a simple image comparison slider. It uses the [img-comparison-slider](https://img-comparison-slider.sneas.io/) library. Check the [examples page](https://img-comparison-slider.sneas.io/examples.html) for more information of what you can achieve with it.
 
 <img-comparison-slider>
-  {% include figure.liquid path="assets/img/prof_pic.jpg" class="img-fluid rounded z-depth-1" slot="first" %}
-  {% include figure.liquid path="assets/img/prof_pic_color.png" class="img-fluid rounded z-depth-1" slot="second" %}
+  {% include figure.liquid path="assets/img/keyu.jpg" class="img-fluid rounded z-depth-1" slot="first" %}
+  {% include figure.liquid path="assets/img/keyu_color.png" class="img-fluid rounded z-depth-1" slot="second" %}
 </img-comparison-slider>

assets/img/wechat-qr.jpg: -224 KB (binary, not shown)

assets/img/wechat_qr.jpg: 210 KB (binary)

assets/pdf/VLM_Rationales.pdf: 3.31 MB (binary, not shown)

lighthouse_results/desktop/alshedivat_github_io_al_folio_.html

Lines changed: 1 addition & 1 deletion (large diff not rendered)
