Data Science

M.Sc Interdisciplinary Medical Science 9506

Course Description:

This course provides a foundational introduction to high-throughput sequencing, data analysis, and statistical concepts essential for interpreting complex biological datasets. Students will gain practical experience with R, including Markdown and knitr for reproducible data analysis, and explore advanced topics such as multivariate datasets and compositional data. Core statistical principles, including Bayesian thinking and concepts like regression to the mean and the gambler’s fallacy, are introduced to enhance data interpretation skills. Key techniques such as Principal Component Analysis (PCA) and compositional biplots are taught in the context of high-throughput sequencing error structures. By the end of the course, students will be equipped to perform reproducible analyses and critically evaluate correlations in compositional datasets.

Learning Objectives: 

  1. Understand the principles and data types associated with high-throughput sequencing.

  2. Develop proficiency in R, including using markdown and knitr for functional note-taking and reproducible analyses.

  3. Apply statistical concepts such as Bayesian thinking, regression to the mean, and the gambler’s fallacy to real-world data problems.

  4. Analyze multivariate and compositional datasets, including understanding geometric means and error structures.

  5. Utilize techniques like PCA and compositional biplots to explore and interpret correlations in complex datasets.

Course Artifacts

These are some of the works I have produced in this course.

The objective of the Patent Issues assignment was to examine a key challenge in intellectual property law and evaluate how courts interpret patent validity through principles such as novelty, utility, and full disclosure. Through our analysis of the AstraZeneca v. Apotex case and the role of the Promise Doctrine, I gained insight into how legal decisions influence the lifecycle and protection of scientific innovations. This assignment strengthened my understanding of patent creation strategies and underscored the importance of clarity and evidence in securing and defending intellectual property rights.

The goal of the Patent Issues assignment was to critically assess a real-world challenge in intellectual property law by examining how legal standards like novelty, utility, and disclosure are interpreted in court decisions. Our group analyzed the AstraZeneca v. Apotex case to explore the implications of the now-overturned Promise Doctrine. This assignment helped me understand the legal nuances of patent protection and how judicial decisions can impact innovation timelines and commercialization strategies. It emphasized the importance of clear, evidence-based claims in patent applications and deepened my appreciation for the intersection of science, law, and ethics.

Reflection

This course served as an excellent refresher, reinforcing my ability to analyze datasets and apply statistical logic effectively. It provided a deeper understanding of how to interpret and manage high-throughput sequencing data while revisiting essential concepts like Bayesian thinking and multivariate analysis. The practical experience with R and tools like PCA and compositional biplots allowed me to refine my skills in testing research hypotheses. By connecting statistical reasoning with real-world datasets, the course enhanced my confidence in applying analytical techniques to generate meaningful insights in research.

Dr. Gregory Gloor, Professor, PhD

"Your ability to apply R coding to a variety of real-world data science challenges was consistently impressive. Whether it was data wrangling, visualization, or statistical modeling, your assignments demonstrated both technical skill and thoughtful interpretation. You met—and often exceeded—expectations. Well done!"
