GitHub Copilot

An introduction to agentic AI assistants

Janick Weberpals, RPh, PhD

jweberpals.academic@gmail.com

August 23, 2025

GitHub Copilot: Your AI Coding Assistant

What is GitHub Copilot?

Figure 1: GitHub Copilot overview (figure created via GPT-5).

Disclaimer

COI

I do not have any financial or personal relationships with Microsoft or OpenAI that could influence the content of this presentation.

Some Important Notes Before We Dive In

AI is developing rapidly, and so is GitHub Copilot
Copilot is not perfect; it can make mistakes
Always review suggestions carefully
- See it as a helpful assistant/buddy, not a replacement for your own judgment
Copilot does not replace human expertise (yet)
We will only cover the basics today
- If you’d be interested in a more advanced session, let us know via the course feedback form!

How to Access Copilot

Sign up: github.com/features/copilot
Requirements: GitHub account (free, subscription)
More detailed instructions can be found on the course website (section 2)

Figure 2: Sign up for GitHub Copilot.

How does Copilot work?

Figure 3: Illustration of how GitHub Copilot works in principle.

Where Can You Access GitHub Copilot?

Visual Studio Code and GitHub Codespaces: popular coding environments with a Copilot extension
In VS Code: Extensions → Search “GitHub Copilot” → Install

Figure 4: GitHub Copilot extension in Visual Studio Code.

RStudio: Copilot is integrated into RStudio for R development

Figure 5: GitHub Copilot in RStudio.

Figure 6: GitHub Copilot in RStudio.

How to Use Copilot To Enhance Your Work in Pharmacoepidemiology

Use Cases for Analytic Code

Code comprehension & review

Ask Copilot to explain complex code
- “The code is too long and complex” is no longer a valid reason to withhold the code behind a research study; sharing it enhances transparency and reproducibility (e.g., in peer review)
Translate code between languages (e.g., R to Python)
Summarize code changes
Suggest improvements or catch potential bugs

Code development, testing and documentation

Write new functions, scripts, or analyses faster
Generate boilerplate code for data import, cleaning, modeling
Create unit tests for your functions (!!!)
Suggest test cases (simulated data) and edge scenarios
Auto-generate docstrings, comments, and README content
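
To make the unit-test and edge-case bullets concrete, here is a minimal sketch of the kind of test file one might ask Copilot to draft. It is written in Python purely for illustration (the course examples use R). The function `categorize_age` and its cut points are hypothetical and do not come from the slides.

```python
import unittest


def categorize_age(age):
    """Map an age in years to a coarse category (hypothetical cut points)."""
    if age is None or age < 0:
        raise ValueError("age must be a non-negative number")
    if age < 40:
        return "<40"
    if age < 65:
        return "40-64"
    return "65+"


class TestCategorizeAge(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(categorize_age(30), "<40")
        self.assertEqual(categorize_age(50), "40-64")
        self.assertEqual(categorize_age(80), "65+")

    def test_boundaries(self):
        # Edge scenarios: values sitting exactly on a cut point
        self.assertEqual(categorize_age(0), "<40")
        self.assertEqual(categorize_age(40), "40-64")
        self.assertEqual(categorize_age(65), "65+")

    def test_invalid_input(self):
        # Invalid ages should fail loudly instead of being silently binned
        with self.assertRaises(ValueError):
            categorize_age(-1)
        with self.assertRaises(ValueError):
            categorize_age(None)
```

Saved as a file, these tests can be run with `python -m unittest`. Whatever the language, generated tests deserve the same careful review as generated code.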

Use Cases for Git

GitHub Copilot for Git

Automate common Git workflows
- Example: generate commit messages based on code changes (git diff)
Tracking
- “What changes were made to this repository in the last 5 commits?”
Staging & committing
Advanced usage (create branches, help with pull requests and merging, etc.)

Copilot Modes

Figure 7: Overview of Copilot modes available in VS Code.

Prompt engineering

Tip

Start general, then get specific
- Write a function in R. The function should do X and Y based on Z input.
Provide context and examples
- Return a propensity score-matched cohort in a data.frame that contains the patient ID (patid), covariates (all columns that start with “c_”) and the computed propensity score as “ps”
Segment complex tasks into sub-tasks
- First create a script called “xyz_function.R”, then write the function, then test the function with dataset 1, address any potential errors and then test again
Iterate on prompts based on feedback
- Please refactor the code such that it returns the age variable as a categorical variable

More recommendations and examples can be found here.

Model Choice

Table 1: Comparison of GitHub Copilot models.

Model	Task area	Excels at (primary use case)	Additional capabilities
GPT-4.1	General-purpose coding and writing	Fast, accurate code completions and explanations	Agent mode, vision
GPT-4o	General-purpose coding and writing	Fast completions and visual input understanding	Agent mode, vision
o3	Deep reasoning and debugging	Multi-step problem solving and architecture-level code analysis	Reasoning
o4-mini	Fast help with simple or repetitive tasks	Fast, reliable answers to lightweight coding questions	Lower latency
Claude Opus 4.1	Deep reasoning and debugging	Complex problem-solving challenges, sophisticated reasoning	Reasoning, vision
Claude Opus 4	Deep reasoning and debugging	Complex problem-solving challenges, sophisticated reasoning	Reasoning, vision
Claude Sonnet 3.5	Fast help with simple or repetitive tasks	Quick responses for code, syntax, and documentation	Agent mode, vision
Claude Sonnet 3.7	Deep reasoning and debugging	Structured reasoning across large, complex codebases	Agent mode, vision
Claude Sonnet 4	Deep reasoning and debugging	Performance and practicality, perfectly balanced for coding workflows	Agent mode, vision
Gemini 2.5 Pro	Deep reasoning and debugging	Complex code generation, debugging, and research workflows	Reasoning, vision
Gemini 2.0 Flash	Working with visuals (diagrams, screenshots)	Real-time responses and visual reasoning for UI and diagram-based tasks	Vision
Taken from https://docs.github.com/en/copilot/reference/ai-models/model-comparison

Copilot in Action (Ask Mode)

Example use case: Comprehend and review existing code

Code example taken from Abdelaziz et al., PDS 2024¹

“To illustrate an application of the approach, we created and analyzed an active comparator new user cohort. Briefly, we implemented an active comparator new user design comparing the risk of bladder cancer of sodium–glucose co-transporter 2 (SGLT-2) inhibitors and glucagon-like peptide 1 receptor agonists (GLP-1RAs) inspired by a recent study from Abrahami et al.² […]”

Example prompt

How did the authors define the continuous enrollment periods? Provide details and show me the code that was used to derive them.

Figure 8: Example code/GitHub repository by Abdelaziz et al.¹

Copilot in Action (Agent Mode)

Example: Simulate dataset and create a Table 1

Example prompt

Generate a simulated dataset (data.frame) using the R programming language. The simulated dataset should resemble the main characteristics and baseline distributions of a fictional randomized trial with two treatment arms and a few baseline covariates, including age (continuous: mean 65), sex (categorical: male (40%) and female (60%)), disease stage (ordinal: I, II, III, IV, each with 25% prevalence) and biomarker status (binary: TRUE (30%), FALSE (70%)).

Use a step-by-step approach:

1. Create a new file called “01_simulate_data.R”
2. Simulate the dataset and do not store the simulated dataset
3. Illustrate the baseline characteristics and distributions by treatment arm using the tbl_summary() function of the gtsummary package. You don’t need to execute the code
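
For orientation, the logic requested in this prompt can be sketched as follows. This is not the script Copilot produces, and it does not use R or gtsummary: it is a Python stand-in (standard library only) that mirrors the same steps. The function names, the 1:1 randomization, the standard deviation of 10 for age, and the default sample size are assumptions, since the prompt does not specify them.

```python
import random
from collections import Counter
from statistics import mean


def simulate_trial(n=1000, seed=42):
    """Simulate baseline data for a fictional two-arm randomized trial."""
    rng = random.Random(seed)  # local generator keeps the result reproducible
    rows = []
    for patient_id in range(1, n + 1):
        rows.append({
            "patid": patient_id,
            "arm": rng.choice(["A", "B"]),  # 1:1 randomization (assumed)
            "age": rng.gauss(65, 10),  # mean 65 from the prompt; SD 10 is assumed
            "sex": rng.choices(["male", "female"], weights=[40, 60])[0],
            "stage": rng.choice(["I", "II", "III", "IV"]),  # 25% each
            "biomarker": rng.random() < 0.30,  # True in about 30% of patients
        })
    return rows  # kept in memory only; nothing is written to disk


def summarize_by_arm(rows):
    """Return per-arm counts, mean age, and category proportions."""
    summary = {}
    for arm in sorted({r["arm"] for r in rows}):
        group = [r for r in rows if r["arm"] == arm]
        n = len(group)
        summary[arm] = {
            "n": n,
            "mean_age": mean(r["age"] for r in group),
            "sex": {k: v / n for k, v in Counter(r["sex"] for r in group).items()},
            "stage": {k: v / n for k, v in Counter(r["stage"] for r in group).items()},
            "biomarker_true": sum(r["biomarker"] for r in group) / n,
        }
    return summary


if __name__ == "__main__":
    for arm, stats in summarize_by_arm(simulate_trial()).items():
        print(arm, stats)
```

Because the trial is randomized, the two arms should look similar on every covariate. Checking that is a quick sanity test for any simulated Table 1, whether it comes from this sketch or from Copilot's R output.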

Copilot in Action (Edit Mode)

Copilot in Action (Git Workflow)

Questions?

Give it a try in your next project!

References

1. Abdelaziz AI, Hanson KA, Gaber CE, Lee TA. Optimizing large real-world data analysis with Parquet files in R: A step-by-step tutorial. Pharmacoepidemiology and Drug Safety 2024; 33: e5728.

2. Abrahami D, Tesfaye H, Yin H, et al. Sodium–glucose cotransporter 2 inhibitors and the short-term risk of bladder cancer: An international multisite cohort study. Diabetes Care 2022; 45: 2907–2917.