|
6 | 6 | <div class="container"> |
7 | 7 | <h2>{{ .Title }}</h2> |
8 | 8 | </div> |
9 | | - </section><!-- End Breadcrumbs --> |
| 9 | + </section> |
| 10 | + <!-- End Breadcrumbs --> |
10 | 11 |
|
11 | 12 | <!-- ======= Portfolio Details Section ======= --> |
12 | 13 | <section id="portfolio-details" class="portfolio-details"> |
13 | 14 | <div class="container"> |
14 | 15 | <h3>Culture & Research Philosophy</h3> |
15 | | - <p>Our interdisciplinary team works at the frontier of <strong>computer vision</strong>, <strong>natural language processing</strong> and <strong>human-computer interaction</strong> and its intersection, guided by curiosity and scientific rigor:</p> |
| 16 | + <p> |
| 17 | + Our interdisciplinary team works on the foundational models of |
| 18 | + <strong>computer vision</strong>, |
| 19 | + <strong>natural language processing</strong>, and <strong>multimodal learning</strong>: |
| 20 | + </p> |
16 | 21 | <ul> |
17 | | - <li><strong>Experimental</strong>: Conduct reproducible experiments that advance fundamental understanding.</li> |
18 | | - <li><strong>Computational</strong>: Leverage algorithms, models, and coding expertise to tackle challenging questions.</li> |
| 22 | + <li> |
| 23 | + <strong>Experimental</strong>: Conduct reproducible experiments that |
| 24 | + advance fundamental understanding. |
| 25 | + </li> |
| 26 | + <li> |
| 27 | + <strong>Computational</strong>: Leverage algorithms, models, and |
| 28 | + coding expertise to tackle challenging questions. |
| 29 | + </li> |
19 | 30 | </ul> |
20 | 31 |
|
21 | 32 | <h3>Lab Entry</h3> |
22 | | - <p>We welcome students from all disciplines with strong curiosity and a passion for rigorous, original research. We commonly submit our work to conferences including ACL, NIPS, CVPR and ACM CHI.</p> |
| 33 | + <p> |
| 34 | + We welcome students from all disciplines with strong curiosity and a |
| 35 | + passion for rigorous, original research. We commonly submit our work to |
| 36 | + conferences including ACL, NeurIPS, CVPR and ACM CHI. |
| 37 | + </p> |
23 | 38 |
|
24 | 39 | <ol class="project-list"> |
25 | | - |
26 | | - <h3>Digital Public Infrastructure</h3> |
27 | | -<ul> |
28 | | - <li> |
29 | | - <strong>Facial Recognition:</strong> |
30 | | - Facial recognition carries significant technical and societal risks. Key thesis problems include ensuring fairness and eliminating demographic bias, developing robust liveness detection and presentation-attack detection (PAD) to defend against deepfakes, 3D masks, photo/video replay and morphing attacks, and establishing ethical deployment practices and privacy-preserving pipelines. |
31 | | - </li> |
32 | | - <li> |
33 | | - <strong>OCR (Optical Character Recognition):</strong> |
34 | | - This is the core technology for digitizing the physical world, but it struggles with complex, real-world text. Key research problems are accurately reading complex layouts (tables, forms), deciphering handwriting, and handling low-quality images. |
35 | | - </li> |
36 | | -</ul> |
37 | | - |
38 | | -<h3>LLM</h3> |
39 | | -<ul> |
40 | | - <li> |
41 | | - <strong>LLM Reasoning and Hallucinations:</strong> |
42 | | - Large Language Models serve as the engine for complex applications like the AI tutor, mental health chatbot, AI knowledge portal, AI interviewer, and AI market research, but they frequently suffer from logical failures and hallucinations. Key research problems for a thesis include developing frameworks to improve reasoning (e.g., Chain-of-Thought) and methods to detect and reduce hallucinations. |
43 | | - </li> |
44 | | - <li> |
45 | | - <strong>Small Language Models (SLMs):</strong> |
46 | | - While LLMs are powerful, they are too large for many real-world applications. This research focuses on training compact models (e.g., <7B parameters) that retain high reasoning capabilities for targeted domains. Key thesis problems involve knowledge distillation, quantization, and efficient training frameworks. |
47 | | - </li> |
48 | | - <li> |
49 | | - <strong>Synthetic Datasets:</strong> |
50 | | - High-quality data is the primary bottleneck for modern AI. This topic focuses on using LLMs to generate training data for other models, solving issues of scarcity and privacy. Key research problems include ensuring diversity and preventing "model collapse" (where models degrade when trained on synthetic data). |
51 | | - </li> |
52 | | - <li> |
53 | | - <strong>Synthetic Personas:</strong> |
54 | | - This topic explores using generative AI to create realistic, AI-driven user profiles as a new method for market research. The core research challenge for a Master's or PhD project is validation: how do we prove that the insights from these synthetic personas accurately reflect real human behavior and avoid AI-generated "echo chambers"? |
55 | | - </li> |
56 | | - <li> |
57 | | - <strong>Human-AI Interaction:</strong> |
58 | | - This is foundational research for ensuring AI makes us *better*, not just replaces us. It focuses on designing collaborative systems that augment human cognition, creativity, and learning. Key research questions for a student project include: How do we design AI that is truly collaborative and understandable? How do we measure its real impact on human learning? And how do we ensure it supports human well-being? |
59 | | - </li> |
60 | | -</ul> |
61 | | - |
62 | | -<h3>Multimodal Learning</h3> |
63 | | -<ul> |
64 | | - <li> |
65 | | - <strong>Medical VQA (Visual Question Answering):</strong> |
66 | | - This technology can act as a "co-pilot" for doctors, helping them interpret medical scans by answering natural language questions. Key problems are the extreme need for accuracy in a high-stakes environment and the lack of large-scale annotated datasets. A core PhD challenge is creating these datasets, which requires expert <strong>manual annotation</strong> from doctors and exploring <strong>synthetic data</strong> to augment rare-disease examples. |
67 | | - </li> |
68 | | - <li> |
69 | | - <strong>Speech2Text and Voice Cloning:</strong> |
70 | | - This technology breaks down communication barriers and creates new possibilities for personalized media. Key problems are achieving accuracy in noisy environments with diverse accents and addressing the ethical risks of deepfake audio. A major project area is creating large-scale, diverse speech datasets, which involves balancing expensive <strong>manual transcription</strong> with <strong>crowdsourced</strong> and <strong>synthetically-generated</strong> audio to cover many languages and accents. |
71 | | - </li> |
72 | | -</ul> |
73 | | - |
74 | | -<h3>Medical Devices</h3> |
75 | | -<ul> |
76 | | - <li> |
77 | | - <strong>EEG (Brain-Computer Interfaces):</strong> |
78 | | - BCI offers a life-changing communication channel for people with severe motor disabilities. Key research problems for a Master's or PhD project involve improving the low speed and signal-to-noise ratio of non-invasive EEG, reducing user fatigue, and engineering robust hybrid BCI systems (like those combining SSVEP and P300 signals) that are fast and reliable enough for daily use. |
79 | | - </li> |
80 | | - <li> |
81 | | - <strong>Raman Spectroscopy:</strong> |
82 | | - This research could lead to a new class of non-invasive medical diagnostics, like a "no-prick" portable glucose monitor. Key problems for a student project involve applying advanced signal processing and machine learning to extract a clear, reliable signal (e.g., glucose) from the highly complex and "noisy" spectroscopic data of human tissue, and then engineering an accurate, portable, and affordable device. |
83 | | - </li> |
84 | | -</ul> |
85 | | -</ol> |
86 | | - |
87 | | - {{/* <ol class="project-list"> |
88 | | - |
89 | | - <!-- Topic 1: Human–AI Interaction --> |
90 | | - <li class="project-item"> |
91 | | - <img src="/img/demo2/human.jpg" alt="Human–AI Interaction" class="project-img"> |
92 | | - <div class="project-text"> |
93 | | - <strong>Topic 1: Human–AI Interaction</strong> |
94 | | - <p> |
95 | | - We study how AI can be designed to augment rather than replace human abilities. |
96 | | - Our focus is on enhancing cognition, creativity, and decision-making while |
97 | | - supporting well-being and fairness. We also explore the design of AI-enabled |
98 | | - applications such as tutors, interviewers, and mental health companions, with |
99 | | - attention to human growth and well-being. |
100 | | - </p> |
101 | | - <ul> |
102 | | - <li>How can AI enhance <b>human cognition</b>, creativity, decision-making, and well-being while reducing biases?</li> |
103 | | - <li>What AI-enabled <b>applications</b> (e.g., AI tutors, AI interviewers, mental health chatbots, medical VQA) best support human growth and well-being?</li> |
104 | | - <li>What design <b>principles</b> and psychological theories ensure these systems are effective?</li> |
105 | | - </ul> |
106 | | - </div> |
107 | | - </li> |
108 | | - |
109 | | - <!-- Topic 2: Natural Language Processing --> |
110 | | - <li class="project-item"> |
111 | | - <img src="/img/demo2/slm.jpg" alt="Natural Language Processing" class="project-img"> |
112 | | - <div class="project-text"> |
113 | | - <strong>Topic 2: Natural Language Processing</strong> |
114 | | - <p> |
115 | | - Our NLP research explores how to make language and speech technologies more |
116 | | - efficient, accessible, and powerful. We investigate training frameworks for |
117 | | - compact small language models that retain much of the capability of large |
118 | | - models, while also advancing speech recognition, diarization, and mixed-language |
119 | | - processing to perform robustly in real-world environments. We are also interested to combine |
120 | | - computer science techniques like ontology or first-order logic to improve reasoning capabilities of small language models. |
121 | | - </p> |
122 | | - <ul> |
123 | | - <li>How can we create a framework for training <b>small language models</b> that achieve LLM-like power in targeted domains?</li> |
124 | | - <li>What methods enable compact models to generalize across tasks with limited data?</li> |
125 | | - <li>Can we combine <b>ontology</b> or first-order logic to improve the reasoning capabilities of small language models?</li> |
126 | | - <li>How do we advance <b>speech recognition</b>, diarization, voice cloning and mixed-language processing for real-world environments?</li> |
127 | | - </ul> |
128 | | - </div> |
129 | | - </li> |
130 | | - |
131 | | - <!-- Topic 3: Neural Architectures & Multimodal Models --> |
132 | | - <li class="project-item"> |
133 | | - <img src="/img/demo2/model.png" alt="Multimodal Models" class="project-img"> |
134 | | - <div class="project-text"> |
135 | | - <strong>Topic 3: Neural Architectures & Multimodal Models</strong> |
136 | | - <p> |
137 | | - Beyond applications, we focus on the foundations of building better models. This includes analyzing neural architectures, adding novel components for improved reasoning, and developing multimodal systems that integrate vision, language, and domain knowledge. We are especially interested in high-impact areas such as medical visual question answering (VQA). |
138 | | - </p> |
139 | | - <ul> |
140 | | - <li>What <b>architectural innovations</b> improve reasoning and generalization in neural networks?</li> |
141 | | - <li>How can we design <b>multimodal models</b> that effectively combine text, speech, and vision?</li> |
142 | | - <li>How can we combine <b>graph</b>, <b>attention</b>, or <b>sequential</b> components for better performance?</li> |
143 | | - </ul> |
144 | | - </div> |
145 | | - </li> |
146 | | - |
147 | | - </ol> */}} |
148 | | - |
149 | | - |
| 40 | + <ul> |
| 41 | + <li> |
| 42 | + <strong>Facial Recognition:</strong> |
| 43 | + Developing accurate and fair facial recognition systems remains a |
| 44 | + significant challenge. Key thesis problems include addressing bias |
| 45 | + in training data, improving recognition accuracy under varying |
| 46 | + lighting and occlusion conditions, and enhancing privacy-preserving |
| 47 | + techniques. |
| 48 | + </li> |
| 49 | + <li> |
| 50 | + <strong>OCR (Optical Character Recognition):</strong> |
| 51 | + Extracting text from images and documents is essential for |
| 52 | + digitization and information retrieval. Key thesis problems include |
| 53 | + improving accuracy on complex layouts and fonts, handling |
| 54 | + multi-language documents, and developing efficient algorithms for |
| 55 | + real-time applications. |
| 56 | + </li> |
| 57 | + <li> |
| 58 | + <strong>Small Language Models (SLMs):</strong> |
| 59 | + While large language models (LLMs) have garnered significant |
| 60 | + attention, SLMs are crucial for resource-constrained environments. |
| 61 | + Key thesis problems include optimizing model architectures for |
| 62 | + efficiency, developing effective training techniques with limited |
| 63 | + data, and ensuring robust performance across diverse tasks and |
| 64 | + domains. |
| 65 | + </li> |
| 66 | + <li> |
| 67 | + <strong>Speech2Text:</strong> |
| 68 | + Converting spoken language into text is crucial for accessibility |
| 69 | + and human-computer interaction. Key thesis problems include |
| 70 | + improving accuracy in noisy environments, handling diverse accents |
| 71 | + and dialects, and developing real-time transcription systems. |
| 72 | + </li> |
| 73 | + <li> |
| 74 | + <strong>Text2Speech and Voice Cloning:</strong> |
| 75 | + Generating natural-sounding speech from text has many applications, |
| 76 | + but challenges remain in prosody, emotion, and voice diversity. Key |
| 77 | + thesis problems include improving the naturalness and expressiveness |
| 78 | + of generated speech, developing robust voice cloning techniques, |
| 79 | + and addressing ethical concerns around consent and misuse. |
| 80 | + </li> |
| 81 | + </ul> |
| 82 | + </ol> |
150 | 83 | </div> |
151 | | - </section><!-- End Portfolio Details Section --> |
| 84 | + </section> |
152 | 85 | </main> |
153 | 86 |
|
154 | 87 | <style> |
|