The aim of the final project is to demonstrate that you are able to use machine learning techniques to solve or explore practical and research problems. You can choose between two types of project:

  1. Competitive projects: Work individually on a task and dataset that we provide. Participants will be placed on a common leaderboard on Kaggle. Students are typically encouraged to choose this option.
  2. Exploratory projects: Work in groups of at most two students on a project topic of your choice.

Grading and Milestones

Your project is worth 30% of the class grade. Please create a GitHub repository to host and maintain your project. The grading is broken into the following milestones:

  1. Project team (5 points): inform the instructor of the type of project you choose (competitive/exploratory) and, if it is an exploratory project, the members of your group.

  2. Project proposal (10 points): for a competitive project, you need to register for the Kaggle competition and make a dummy submission. You should also submit your Kaggle user details on Canvas so that we can track your identity in the competition.

    For an exploratory project, submit a one-page document that answers the following questions:

    • Who is on the project team?
    • What problem do you want to address?
    • Why is it interesting? Why do you want to use machine learning rather than traditional/existing methods?

  3. Mid-term report (20 points): Submit a three-page document (size 11 font) describing what you have done for your project.

    For a competitive project, you should have made at least two non-dummy submissions on Kaggle. Your report should describe what you did, any pre-processing that you needed to do, and briefly outline your plan for the rest of the semester. Bonus: the 10 top-ranked students at the report deadline (not including late days) will receive 3 bonus points.

    For an exploratory project, your report should include:
    • What you have done to reach your goal. Note that just “We collected data” will NOT be enough (50%)
    • Your detailed plan for the rest of the project (30%)
    • References to the literature (20%)
  4. Final report (65 points): The final report may be up to six pages long (size 11 font) and should be structured as a small research paper. It should contain the following:
    1. Problem definition and motivation - what problem did you choose? Why is it important or interesting? Why did you use machine learning techniques to solve it? (20 points)
    2. Your solution - the details of the machine learning models/algorithms you chose/developed (or proofs for theoretical projects) (20 points)
    3. Experimental results (20 points)
    4. If you had much more time, how would you continue the project? (5 points)

    For a theoretical project, the solution and experimental evaluation will be graded as one component worth 40 points. Note: the final report must include a link to the GitHub repository that contains the implementation of your project. We will check your implementation as well. A missing GitHub link will lead to a grade of zero for the final report.

    Important: Experimental evaluations should be rigorous, i.e., choose fair baselines, apply cross-validation for hyper-parameter selection, and report both positive and negative results.

    Bonus for competitive projects: the 10 top-ranked students at the report deadline (not including late days) will receive 7 bonus points.
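
As one possible illustration of the cross-validation requirement above, the sketch below selects a hyper-parameter by k-fold cross-validation and then compares the selected model against a simple baseline on held-out data. Everything in it is an assumption made for illustration only: the synthetic one-dimensional dataset, the k-nearest-neighbour classifier, the candidate values, and the helper names (make_data, knn_predict, k_fold_indices, cv_accuracy, select_k) are hypothetical and are not part of the course material. It uses only the Python standard library; in a real project you would likely rely on an established library instead.

```python
import random
from collections import Counter


def make_data(n=150, seed=0):
    """Synthetic binary data (an assumption): x ~ N(2 * label, 1)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cv_accuracy(data, k, n_folds=5):
    """Mean validation accuracy of k-NN over the folds.

    The same folds are reused for every k so candidates are compared fairly.
    """
    scores = []
    for held_out in k_fold_indices(len(data), n_folds):
        held = set(held_out)
        train = [p for i, p in enumerate(data) if i not in held]
        valid = [data[i] for i in held_out]
        correct = sum(knn_predict(train, x, k) == y for x, y in valid)
        scores.append(correct / len(valid))
    return sum(scores) / len(scores)


def select_k(data, candidates=(1, 3, 5, 9, 15)):
    """Return the candidate with the best cross-validated accuracy, plus all scores."""
    scores = {k: cv_accuracy(data, k) for k in candidates}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    data = make_data()
    dev, test = data[:120], data[120:]  # the test split is never used for selection

    best_k, scores = select_k(dev)
    for k, score in scores.items():  # report every candidate, not only the winner
        print(f"k={k:2d}  cv accuracy={score:.3f}")

    # Baseline: always predict the most common label in the development data.
    majority = Counter(y for _, y in dev).most_common(1)[0][0]
    baseline_acc = sum(y == majority for _, y in test) / len(test)
    model_acc = sum(knn_predict(dev, x, best_k) == y for x, y in test) / len(test)
    print(f"selected k={best_k}  test accuracy={model_acc:.3f}  baseline={baseline_acc:.3f}")
```

The hyper-parameter is chosen using the development data only, and the test split is touched once at the end. Printing the score of every candidate alongside the baseline is one way to report both positive and negative results.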


Topics for Exploratory Projects

Any project using machine learning as a critical step or component will be fine.

If you are looking for ideas for possible projects, come to office hours and we can brainstorm. Projects can be one of:

  1. An application project, e.g., some machine learning application that you find interesting.

  2. Reproduction of published results, e.g., you are interested in a machine learning paper and want to reimplement its model/algorithm to reproduce its experimental results.

  3. A theoretical project, e.g., prove interesting properties of a learning algorithm.

  4. An algorithmic project, e.g., develop a new learning algorithm for a particular type of problem.

  5. Your own research, e.g., if you are already working on some project and wish to apply machine learning methods.

In general, choose a topic that you find exciting, and convince me that it is important/interesting.


Exploratory Project Examples

  • Kaggle competition tasks
  • Biology and medical study: can we select genes relevant to some disease, such as breast cancer or Alzheimer's Disease?
  • Learning to play board games like Checkers, Settlers of Catan, etc.
  • Stock market prediction: can we predict the trend of the stock price (going up or down)?
  • Detecting sarcasm, hate speech, humor, irony, metaphor, etc. in text
  • Classifying the genre of a song/book/movie/painting
  • Commodity recommendation: can we use customers' purchase records to recommend commodities to old and new customers?
  • Software and security: can we identify Android apps that contain malware?
  • Sentiment analysis: can we classify whether a comment is positive or negative?
  • Spam email detection
  • ...