Over the course of the term, students will work on a term-long project in groups of 3 or 4.

The objective of each project is to use rich, high-quality datasets to address open problems in the health domain. Project tasks can include data mining, modeling, prediction, and classification. Most importantly, projects should aim to advance the state of the art in the research literature or in practice.

To get started, see strategies and resources for finding a research dataset.


Each project has three deliverables:

  1. An in-class presentation

  2. A written report. This can be either a publishable paper intended for submission to a suitable venue or a report written strictly for this course. In either case, the paper should use the target venue's paper template and follow the author guidelines of that journal or conference. See the guidelines below.

  3. A project website that documents the project and its progress. This website will serve as a final portfolio for the work done throughout the term. Some examples from previous terms are linked below:

Guidelines for Final Paper

Every project group is expected to write a final paper to share their research results and findings. Writing milestones will be due throughout the term to keep teams on track. Each team should use the paper template provided by the venue below and write their report in accordance with its "guidelines for authors":

Alternative venues can be selected based on the team project and preference. Other example venues are:

Important Notes:

  • Manuscript length: approximately 10-12 pages, not including references. This guideline is for teams writing for ACM HEALTH, for which no page limit is given.

  • Organization: Every manuscript must follow instructions provided by the selected venue. An example of submission guidelines for ACM HEALTH can be found here.

  • Template: All papers should use the appropriate template provided by the selected venue. An example of such a template for ACM HEALTH can be found here.

  • Reference Papers: It is always a good idea to keep a few example papers from the selected publication venue on hand as a reference while writing your own paper. Some example reference papers for ACM HEALTH can be found here.

  • LaTeX: All final papers should be written using LaTeX. Each project team should use Overleaf, an online, collaborative LaTeX editor.

Project Milestones

Coming soon...

P1 - Exploratory & Initial Analysis

Exploratory data analysis (EDA) is a critical and often neglected step in data analysis. In this assignment, students should conduct EDA appropriate to their dataset, with the primary goals of understanding the dataset fully and identifying the types of research questions it is suited to answer. Some guidelines on conducting EDA can be found on the resources page.
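As a starting point, a first EDA pass might look like the sketch below. It is only an illustration, not a required structure: the data is synthetic, and the column names (`age`, `systolic_bp`, `sex`) and the `summarize` helper are hypothetical stand-ins for whatever your own dataset contains. It assumes pandas and matplotlib are installed, as they are by default in Google Colab.

```python
import random

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data; in a real project, load your own dataset
# instead (for example with pd.read_csv).
random.seed(0)
n = 200
df = pd.DataFrame({
    "age": [random.randint(18, 90) for _ in range(n)],
    "systolic_bp": [round(random.gauss(125, 15), 1) for _ in range(n)],
    "sex": [random.choice(["F", "M"]) for _ in range(n)],
})
# Blank out every 20th blood-pressure reading to mimic missing data.
df.loc[df.index[::20], "systolic_bp"] = None


def summarize(frame):
    """Return per-column dtype, missing-value count, and distinct-value count."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "n_missing": frame.isna().sum(),
        "n_unique": frame.nunique(),
    })


# Section 1: structure and missingness
summary = summarize(df)
print(summary)

# Section 2: summary statistics for the numeric columns
print(df.describe())

# Section 3: descriptive figures (one distribution, one group comparison)
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
df["systolic_bp"].plot.hist(bins=20, ax=axes[0], title="Systolic BP distribution")
axes[0].set_xlabel("mmHg")
df.boxplot(column="systolic_bp", by="sex", ax=axes[1])
axes[1].set_title("Systolic BP by sex")
axes[1].set_ylabel("mmHg")
fig.suptitle("")  # drop the automatic "Boxplot grouped by ..." title
fig.tight_layout()
```

In a notebook, the figure renders inline at the end of the cell. Checking structure and missing values before plotting helps you choose figures that are meaningful for each column type, rather than plotting everything the same way.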

Assignment Requirements:

  1. Accept the assignment here and ensure that all members of your team are added to your project repository.

  2. Write clear, clean code (in Python, using Jupyter Notebook or Google Colab) with appropriate comments and section titles to create 10 or more descriptive figures that explore various dimensions of your research dataset.

  3. Create your project website using a freely available service (e.g., Google Sites) that includes:

      • Title

      • Group Members

      • Objective (What is the goal of this project?)

      • Innovation (1-paragraph description of why this work is innovative; you must support this with citations/references to related work in the literature)

      • Data Description (1-paragraph description of the dataset and its important features)

      • Exploratory Analysis (embed the code written for #2 here, either as a .pdf or directly on the website)

      • References (using an appropriate citation format)

  4. Submit a link to your project website and GitHub repo via Canvas.

      • Ensure that your website is publicly accessible through the link submitted, especially if you use Google Sites.

Need some inspiration?

See examples from previous terms below: