This project creates and manages a lightweight relational database for tracking faculty, appointments, and related metrics in Harris College.
faculty-db/
├─ db/ # Contains the DuckDB database file (faculty.duckdb)
│ └─ faculty.duckdb
├─ sql/ # SQL DDL files that define tables and views
│ ├─ 01_people.sql
│ ├─ 02_person_appointments.sql
│ └─ views_01_current_faculty.sql
├─ R/ # R scripts that create schema and load data
│ ├─ data_01_duckdb_schema.R
│ ├─ data_02_load_people.R
│ └─ data_03_load_faculty_appointments.R
├─ data/ # CSV data used to populate the database
│ ├─ faculty_people.csv
│ └─ faculty_appointments.csv
└─ README.md # Project overview and usage instructions
└─ Experiment.md # Quarto for used for experimentation
The Harris College KPIs faculty database is organized around a core person–appointment structure:
peoplestores one record per individual (faculty or staff).person_appointmentsstores one record per appointment, position, or change in role over time.- A single person can have multiple appointments across units or across time (e.g., promotions, transfers, or joint appointments).
erDiagram
people {
TEXT person_id PK
TEXT first_name
TEXT last_name
TEXT email
TEXT orcid
}
person_appointments {
TEXT appt_id PK
TEXT person_id FK
TEXT unit_code
TEXT rank
TEXT track
DOUBLE fte_share
TEXT status
DATE valid_from
DATE valid_to
TEXT source
TEXT notes
}
people ||--o{ person_appointments : "has appointments"
| Table | Type | Description |
|---|---|---|
people |
Core table | Contains basic identifying information for each person. |
person_appointments |
Core table | Contains time-varying details (unit, rank, FTE, status) for each appointment. |
v_current_appointments |
View | Filters person_appointments to show only current and active appointments. |
v_current_faculty |
View | Joins people and v_current_appointments to show all currently active faculty members. |
This normalized structure keeps historical records of each person’s trajectory within the college and makes it easy to query current status, track promotions, and maintain longitudinal consistency.
-
Open the R Project
- Double-click
KPIs.Rprojor open this folder in RStudio.
- Double-click
-
Create the database and tables
source("R/data_01_duckdb_schema.R") -
Load faculty data
source("R/data_02_load_people.R") -
Load faculty appointments
source("R/data_03_load_faculty_appointments.R") -
Explore the data
library(DBI) library(duckdb) con <- dbConnect(duckdb::duckdb("db/faculty.duckdb")) dbGetQuery(con, "SELECT * FROM v_current_faculty LIMIT 10;") dbDisconnect(con)
- Run scripts in numeric order (
data_01_...,data_02_..., etc.). - The DuckDB file (
db/faculty.duckdb) is not tracked in version control. - SQL files define database structure; R scripts manage data loading.
- To rebuild the database from scratch, delete
db/faculty.duckdband rerundata_01_duckdb_schema.R.
There are two core data tables in this project:
- people – one record per person/faculty
- person_appointments – one record per appointment or change in role/unit over time
-
Open or edit the file
data/faculty_people.csv.- Each row represents one person.
- Leave
person_idblank for new people (the loader script will generate a UUID automatically). - Do not modify existing
person_idvalues — these are permanent keys used in other tables.
-
Save the CSV, then run:
source("R/data_02_load_people.R")This script:
- Generates UUIDs for new rows
- Updates changed rows (by replacing matching
person_ids) - Validates required columns (first_name, last_name)
-
Edit
data/faculty_appointments.csv.- Each row represents one appointment for a faculty member.
- The
person_idmust match an existing record in thepeopletable. - Leave
appt_idblank for new appointments (a new UUID will be created). - Set
valid_toto a date when the appointment ends, or leave blank if it’s ongoing.
-
Run:
source("R/data_03_load_faculty_appointments.R")This script:
- Validates that all
person_idvalues exist in thepeopletable - Replaces any existing rows with the same
appt_id - Ensures dates and FTE values are in valid formats
- Validates that all
After loading, confirm the updates were successful:
library(DBI)
library(duckdb)
con <- dbConnect(duckdb::duckdb("db/faculty.duckdb"))
# Check total faculty and appointments
dbGetQuery(con, "SELECT COUNT(*) AS n_people FROM people;")
dbGetQuery(con, "SELECT COUNT(*) AS n_appointments FROM person_appointments;")
# Preview current active faculty
dbGetQuery(con, "SELECT * FROM v_current_faculty LIMIT 10;")
dbDisconnect(con)- Delete this section before pushing to GitHub.
- Create a visual representation of the relationship between the tables in the faculty database.
- Create a simple Shiny app for viewing and loading faculty?
- Start adding other tables we can use to track KPI's (see ClickUp)?
- Push to GH and see if there is a way to automatically create new ClickUp tasks when I create a new GH issue?