Project Overview
Abel de Wit edited this page May 27, 2019
It is highly recommended not to reinvent the wheel. Therefore, when picking a dataset and a topic, it is important to know what existing resources can be leveraged. Ask yourself the following questions:
- What is the dataset about? What is the problem? Why is this problem interesting and essential?
- Is there an existing approach? Can you apply this approach to your dataset?
- How will you evaluate your idea/approach? What data can be used for evaluating the proposed approach? Is the dataset available?
- What software packages and resources can you use to implement your idea?
- What are the best and worst outcomes of the project? (I.e., measure your risk.)
- Who will be your fellow group members? Do they have special expertise? How will you split the workload?
- Some possible ways to find a topic are:
- Take an existing problem we mentioned (or we will mention) in class and come up with some ideas.
- Read a published paper carefully and ask yourself if there is any challenge left from the paper or if you can improve the proposed approach.
- There are many NLP shared tasks at SemEval, CoNLL, and some workshops. A shared task often provides a well-defined problem and dataset, allowing different teams to fairly compare their approaches. You can use the shared tasks from previous years as a testbed for your approaches or participate in a shared task this year. (It is okay if you cannot get results on the final test set by the time the semester ends. Just evaluate your approach on the development split.)
- A two-page project proposal is due on May 12th (no delays here).
- Pick a title
- State your motivation, your plan, and the expected outcome of your project.
- You can use this chance to draft the introduction and the related work section of your final report.
- You should address the questions in the "Pick a topic" section in your proposal.
- Cite & describe at least 2 pieces of relevant prior work
- Describe the dataset(s) you are going to use
- Outline pre-existing software/tools you are going to use
- Outline preliminary experiments & evaluation methods
- Each team must submit a written project report. You should assume the report is like a short conference paper.
- If you have a demo system, you can include some screenshots of your system.
- It is also recommended to include a discussion of how your research work can be further extended.
- You are required to use the provided ACL LaTeX style files (also available in Overleaf) and submit the report in PDF format.
- The report should be no more than 4 pages (plus unlimited pages for references).
- A concise and short report is better than a lengthy one.
- Different projects may be graded differently.
- For example, a project that reimplements some proposed approaches and performs a comprehensive comparison might get a high score even if no new approach is proposed.
- An ambitious idea (e.g. build a chatbot that can replace Jerry) that fails but goes deep into why it failed might also get a high score.
- Here is a general grading rubric:
- 20% on clarity. (Is it clear what was done? Is the report well-written and well-structured? Is the idea well-motivated? Is the literature review comprehensive?)
- 25% on soundness and correctness. (Is the technical approach well-chosen and deep enough? Is the implementation correct?)
- 25% on meaningful comparison. (Are the experiment settings correct? Are the experimental results of the approaches correctly interpreted?)
- 10% on novelty and substance. (What are the new things that we can learn from this project?)
- Each project team is expected to make a “pitch” (due to time shortage) of their project.
- I expect everyone to attend the final project presentation unless there are special circumstances.
- Make a website (GitHub) about your project. Use it to advertise your work! Place everything there (your code, your report, your figures, etc.).
- The website should highlight your problem, dataset & approach, tailored to a general audience (include figures, examples, etc.).
For a successful project (in this course and in general), you must:
- Identify an open problem, present a hypothesis about it, and survey the relevant literature.
- Design and run an experiment to test that hypothesis.
- Analyze the results to reveal what your experiment tells us about your hypothesis.
- Do this early!
- Learn about common methods, datasets, and libraries that will make your life easier.
- Buy yourself more time to think about the questions that have or haven’t been answered in the literature.
- How to identify relevant papers
1. Do a keyword search on Google Scholar (or the ACL Anthology)
2. Download the papers that seem most relevant
3. Skim the abstracts, intros, & previous work sections
4. Identify papers that look relevant, appear often, & have lots of citations on Google Scholar
5. Download those papers
6. Return to step 3
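
The search-and-skim loop described above is essentially a small "snowballing" procedure over the references of papers you have already read. The following is only a rough sketch of that idea, not a tool that queries Google Scholar or the ACL Anthology: the `Paper` record, the `snowball` function, the in-memory `library` dictionary, and the thresholds `min_mentions`, `min_citations`, and `max_rounds` are all hypothetical names and values chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical record for one paper (not tied to any real API)."""
    title: str
    citations: int = 0  # e.g. the citation count you saw for the paper
    references: list = field(default_factory=list)  # titles this paper cites


def snowball(seeds, library, min_mentions=2, min_citations=100, max_rounds=3):
    """Grow a reading list by following references of papers already on it.

    seeds:   titles found by the initial keyword search (must be in library).
    library: dict mapping title -> Paper, standing in for what you can look up.
    """
    reading_list = list(seeds)
    for _ in range(max_rounds):
        # "Skim" the current papers: count how often each unread paper is cited.
        mentions = Counter(
            ref
            for title in reading_list
            for ref in library[title].references
            if ref not in reading_list and ref in library
        )
        # Keep papers that appear often or have many citations.
        new = [
            ref
            for ref, count in mentions.items()
            if count >= min_mentions or library[ref].citations >= min_citations
        ]
        if not new:
            break  # nothing new worth reading, so stop iterating
        reading_list.extend(sorted(new))
    return reading_list
```

Capping the number of rounds (an assumption of this sketch) reflects that in practice you stop once new rounds mostly surface papers you have already seen.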