This package provides the core functionality for accessing and querying organizational data in a performant, indexed manner. The `orgdatacore` package is designed to be a reusable component that can be consumed by multiple services, including:
- Slack bots (ci-chat-bot)
- REST APIs
- CLI tools
- Other organizational data consumers
Key features:
- Fast Data Access: Pre-computed indexes enable O(1) lookups for common queries
- Thread-Safe: Concurrent access with read-write mutex protection
- Hot Reload: Support for dynamic data updates without service restart
- Pluggable Data Sources: Load from files, GCS, or implement custom sources
- Comprehensive Queries: Employee, team, and organization lookups with membership validation
- Cross-Cluster Ready: Designed for distributed deployments with remote data sources
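To consume the package from one of these services, add it as a normal Go module dependency (a minimal sketch assuming standard Go tooling; the module path matches the import path used in the examples below):

```
go get github.com/openshift-eng/cyborg-data
```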
Basic usage, loading data from a local file:

```go
package main

import (
    "context"
    "log"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    // Create a new service
    service := orgdatacore.NewService()

    // Load data using FileDataSource
    fileSource := orgdatacore.NewFileDataSource("comprehensive_index_dump.json")
    err := service.LoadFromDataSource(context.Background(), fileSource)
    if err != nil {
        log.Fatal(err)
    }
}
```
The same service can also watch a data source and hot-reload when it changes:

```go
package main

import (
    "context"
    "log"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    service := orgdatacore.NewService()

    // Load from files using the DataSource interface
    fileSource := orgdatacore.NewFileDataSource("comprehensive_index_dump.json")
    err := service.LoadFromDataSource(context.Background(), fileSource)
    if err != nil {
        log.Fatal(err)
    }

    // Start watching for file changes
    service.StartDataSourceWatcher(context.Background(), fileSource)
}
```
For GCS support, first add the GCS SDK dependency and build with the `gcs` tag:

```
go get cloud.google.com/go/storage
go build -tags gcs
```
```go
package main

import (
    "context"
    "log"
    "time"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    service := orgdatacore.NewService()

    // Configure GCS
    config := orgdatacore.GCSConfig{
        Bucket:        "orgdata-sensitive",
        ObjectPath:    "orgdata/comprehensive_index_dump.json",
        ProjectID:     "your-project-id",
        CheckInterval: 5 * time.Minute,
        // Optional: provide service account credentials directly
        // CredentialsJSON: `{"type":"service_account",...}`,
    }

    // Load from GCS using the SDK implementation
    gcsSource, err := orgdatacore.NewGCSDataSourceWithSDK(context.Background(), config)
    if err != nil {
        log.Fatal(err)
    }

    err = service.LoadFromDataSource(context.Background(), gcsSource)
    if err != nil {
        log.Fatal(err)
    }

    // Start watching for GCS changes
    service.StartDataSourceWatcher(context.Background(), gcsSource)
}
```
The package expects data in the `comprehensive_index_dump.json` format generated by the Python orglib indexing system from the cyborg project.
All queries use pre-computed indexes for O(1) performance:
- Employee lookups: Direct map access via UID or Slack ID
- Team membership: Pre-computed membership index eliminates tree traversal
- Organization hierarchy: Flattened relationship index for instant ancestry queries
- Slack mappings: Dedicated index for Slack ID → UID resolution
The service uses read-write mutex protection:
- Read operations (queries): Multiple concurrent readers supported
- Write operations (data loading): Exclusive access during updates
- Hot reload: Atomic data replacement without query interruption
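A minimal sketch of what this concurrency model allows, assuming the constructors and query methods shown elsewhere in this README (the goroutine structure itself is illustrative, not part of the package API):

```go
package main

import (
    "context"
    "log"
    "sync"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    service := orgdatacore.NewService()
    fileSource := orgdatacore.NewFileDataSource("comprehensive_index_dump.json")
    if err := service.LoadFromDataSource(context.Background(), fileSource); err != nil {
        log.Fatal(err)
    }

    var wg sync.WaitGroup

    // Many readers can query concurrently; each query only needs a read lock.
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            service.GetEmployeeByUID("jsmith")
        }()
    }

    // A reload takes the write lock and swaps the data atomically,
    // so in-flight queries complete against a consistent snapshot.
    wg.Add(1)
    go func() {
        defer wg.Done()
        if err := service.LoadFromDataSource(context.Background(), fileSource); err != nil {
            log.Print(err)
        }
    }()

    wg.Wait()
}
```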
Internally, the loaded data is organized for direct lookups:

```go
// Optimized for fast lookups
type Data struct {
    Lookups Lookups // Direct object access: O(1)
    Indexes Indexes // Pre-computed relationships: O(1)
}

// Example: Employee lookup
employee := data.Lookups.Employees[uid] // Direct map access

// Example: Team membership
memberships := data.Indexes.Membership.MembershipIndex[uid] // Pre-computed list
```
Employee queries:

```go
// Primary employee lookup by UID
employee := service.GetEmployeeByUID("jsmith")

// Slack integration - lookup by Slack user ID
employee = service.GetEmployeeBySlackID("U123ABC456")
// Returns *Employee with: UID, FullName, Email, JobTitle, SlackUID
```

Team queries:

```go
// Get team details
team := service.GetTeamByName("Platform SRE")

// Get all teams for an employee
teams := service.GetTeamsForUID("jsmith")

// Check team membership
isMember := service.IsEmployeeInTeam("jsmith", "Platform SRE")
isSlackMember := service.IsSlackUserInTeam("U123ABC456", "Platform SRE")

// Get all team members
members := service.GetTeamMembers("Platform SRE")
```

Organization queries:

```go
// Get organization details
org := service.GetOrgByName("Engineering")

// Check organization membership (includes membership inherited via teams)
isMember := service.IsEmployeeInOrg("jsmith", "Engineering")
isSlackMember := service.IsSlackUserInOrg("U123ABC456", "Engineering")

// Get complete organizational context
orgs := service.GetUserOrganizations("U123ABC456")
// Returns: teams, orgs, pillars, team_groups the user belongs to
```
| Operation | Complexity | Index Used |
|---|---|---|
| `GetEmployeeByUID` | O(1) | `lookups.employees` |
| `GetEmployeeBySlackID` | O(1) | `indexes.slack_id_mappings` |
| `GetTeamsForUID` | O(1) | `indexes.membership.membership_index` |
| `IsEmployeeInTeam` | O(1) | Pre-computed membership |
| `GetUserOrganizations` | O(1) | Flattened hierarchy index |
No expensive tree traversals are needed: all organizational relationships are pre-computed during indexing.
The package supports pluggable data sources through the `DataSource` interface:
- `FileDataSource` - Local JSON files
  - No additional dependencies
  - Supports file watching with polling
  - Ideal for development and file-based deployments
- `GCSDataSource` - Google Cloud Storage
  - Requires GCS SDK: `go get cloud.google.com/go/storage`
  - Build with `-tags gcs` for full functionality
  - Supports hot reload with configurable polling interval
  - Uses Application Default Credentials (ADC) or service account JSON
  - Ideal for production cross-cluster deployments in GCP
Implement the `DataSource` interface to create custom sources:
```go
type DataSource interface {
    Load(ctx context.Context) (io.ReadCloser, error)
    Watch(ctx context.Context, callback func() error) error
    String() string
}
```
Examples of custom sources you could implement:
- HTTP/HTTPS endpoints
- AWS S3 or other S3-compatible storage
- Git repositories
- Database queries
- Redis/Memcached for caching layers
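For illustration, here is a minimal sketch of an HTTP-backed source satisfying the interface above; the `httpDataSource` type, its package name, and the polling approach are hypothetical and not part of the package:

```go
package mysource

import (
    "context"
    "fmt"
    "io"
    "net/http"
    "time"
)

// httpDataSource is a hypothetical DataSource that fetches the
// comprehensive index dump from an HTTP(S) endpoint.
type httpDataSource struct {
    url      string
    interval time.Duration
}

// Load fetches the JSON dump from the configured URL.
func (s *httpDataSource) Load(ctx context.Context) (io.ReadCloser, error) {
    req, err := http.NewRequestWithContext(ctx, http.MethodGet, s.url, nil)
    if err != nil {
        return nil, err
    }
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, err
    }
    if resp.StatusCode != http.StatusOK {
        resp.Body.Close()
        return nil, fmt.Errorf("unexpected status %s from %s", resp.Status, s.url)
    }
    return resp.Body, nil
}

// Watch polls on a fixed interval and invokes the callback so the
// consuming service can decide whether to reload.
func (s *httpDataSource) Watch(ctx context.Context, callback func() error) error {
    ticker := time.NewTicker(s.interval)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return ctx.Err()
        case <-ticker.C:
            if err := callback(); err != nil {
                return err
            }
        }
    }
}

func (s *httpDataSource) String() string {
    return "http:" + s.url
}
```

Such a source could then be passed to `service.LoadFromDataSource` and `service.StartDataSourceWatcher` just like the built-in sources.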
The package uses structured logging via the `logr` interface, making it compatible with OpenShift and Kubernetes logging standards.

Default: Uses `stdr` (standard library logger wrapper).
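A minimal sketch of wiring up an explicit `stdr` logger, assuming the `github.com/go-logr/stdr` module (the standard library wrapper for `logr`):

```go
package main

import (
    "log"
    "os"

    "github.com/go-logr/stdr"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    // Route the package's structured logs through a standard library logger.
    orgdatacore.SetLogger(stdr.New(log.New(os.Stderr, "", log.LstdFlags)))

    // ... construct the service and load data as shown above ...
}
```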
OpenShift Integration:

```go
import "k8s.io/klog/v2/klogr"
import orgdatacore "github.com/openshift-eng/cyborg-data"

func init() {
    orgdatacore.SetLogger(klogr.New())
}
```
Log events include data source changes, reload operations, and error conditions with structured key-value context.
- Go 1.19+
- Standard library only (no external dependencies for file sources)
- Optional: GCS SDK for Google Cloud Storage support (`cloud.google.com/go/storage`)