Organizational Data Core Package

This package provides the core functionality for accessing and querying organizational data in a performant, indexed manner.

Overview

The orgdatacore package is designed to be a reusable component that can be consumed by multiple services including:

Slack bots (ci-chat-bot)
REST APIs
CLI tools
Other organizational data consumers

Features

Fast Data Access: Pre-computed indexes enable O(1) lookups for common queries
Thread-Safe: Concurrent access with read-write mutex protection
Hot Reload: Support for dynamic data updates without service restart
Pluggable Data Sources: Load from files, GCS, or implement custom sources
Comprehensive Queries: Employee, team, and organization lookups with membership validation
Cross-Cluster Ready: Designed for distributed deployments with remote data sources

Usage

Basic Setup with Files

package main

import (
    "context"
    "log"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    // Create a new service
    service := orgdatacore.NewService()
    
    // Load data using FileDataSource
    fileSource := orgdatacore.NewFileDataSource("comprehensive_index_dump.json")
    err := service.LoadFromDataSource(context.Background(), fileSource)
    if err != nil {
        log.Fatal(err)
    }
}

Using DataSource Interface

package main

import (
    "context"
    "log"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    service := orgdatacore.NewService()
    
    // Load from files using DataSource interface
    fileSource := orgdatacore.NewFileDataSource("comprehensive_index_dump.json")
    err := service.LoadFromDataSource(context.Background(), fileSource)
    if err != nil {
        log.Fatal(err)
    }
    
    // Start watching for file changes
    service.StartDataSourceWatcher(context.Background(), fileSource)
}

Google Cloud Storage Support

For GCS support, first add the GCS SDK dependency and build with the gcs tag:

go get cloud.google.com/go/storage
go build -tags gcs

package main

import (
    "context"
    "log"
    "time"

    orgdatacore "github.com/openshift-eng/cyborg-data"
)

func main() {
    service := orgdatacore.NewService()
    
    // Configure GCS
    config := orgdatacore.GCSConfig{
        Bucket:        "orgdata-sensitive",
        ObjectPath:    "orgdata/comprehensive_index_dump.json",
        ProjectID:     "your-project-id",
        CheckInterval: 5 * time.Minute,
        // Optional: provide service account credentials directly
        // CredentialsJSON: `{"type":"service_account",...}`,
    }
    
    // Load from GCS using the SDK implementation
    gcsSource, err := orgdatacore.NewGCSDataSourceWithSDK(context.Background(), config)
    if err != nil {
        log.Fatal(err)
    }
    
    err = service.LoadFromDataSource(context.Background(), gcsSource)
    if err != nil {
        log.Fatal(err)
    }
    
    // Start watching for GCS changes
    service.StartDataSourceWatcher(context.Background(), gcsSource)
}

Data Structure

The package expects data in the comprehensive_index_dump.json format generated by the Python orglib indexing system from the cyborg project.

Service Architecture

Query Performance

All queries use pre-computed indexes for O(1) performance:

Employee lookups: Direct map access via UID or Slack ID
Team membership: Pre-computed membership index eliminates tree traversal
Organization hierarchy: Flattened relationship index for instant ancestry queries
Slack mappings: Dedicated index for Slack ID → UID resolution
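The idea behind these indexes can be sketched with plain Go maps. This is a minimal illustration, not the package's implementation: the Team and Index types and the BuildIndex function are hypothetical names introduced here. All relationships are resolved once at load time, so each query is a constant number of map lookups (O(1) on average).

```go
package main

import "fmt"

// Team is a hypothetical input record: a team name plus its member UIDs.
type Team struct {
	Name    string
	Members []string
}

// Index holds lookups that are built once, at load time.
// The field layout is an assumption made for this sketch.
type Index struct {
	slackToUID map[string]string              // Slack ID -> UID
	teamsByUID map[string][]string            // UID -> team names
	memberSet  map[string]map[string]struct{} // team name -> set of UIDs
}

// BuildIndex walks the input once and records every relationship,
// so later queries never have to scan teams or members.
func BuildIndex(teams []Team, slackIDs map[string]string) *Index {
	idx := &Index{
		slackToUID: make(map[string]string, len(slackIDs)),
		teamsByUID: make(map[string][]string),
		memberSet:  make(map[string]map[string]struct{}, len(teams)),
	}
	for slackID, uid := range slackIDs {
		idx.slackToUID[slackID] = uid
	}
	for _, t := range teams {
		set := make(map[string]struct{}, len(t.Members))
		for _, uid := range t.Members {
			if _, dup := set[uid]; dup {
				continue // ignore duplicate member entries
			}
			set[uid] = struct{}{}
			idx.teamsByUID[uid] = append(idx.teamsByUID[uid], t.Name)
		}
		idx.memberSet[t.Name] = set
	}
	return idx
}

// IsInTeam is two map lookups; an unknown team yields a nil set and false.
func (i *Index) IsInTeam(uid, team string) bool {
	_, ok := i.memberSet[team][uid]
	return ok
}

// IsSlackUserInTeam resolves the Slack ID to a UID first, then checks membership.
func (i *Index) IsSlackUserInTeam(slackID, team string) bool {
	uid, ok := i.slackToUID[slackID]
	return ok && i.IsInTeam(uid, team)
}

// TeamsFor returns the pre-computed team list for a UID (nil if unknown).
func (i *Index) TeamsFor(uid string) []string {
	return i.teamsByUID[uid]
}

func main() {
	idx := BuildIndex(
		[]Team{{Name: "Platform SRE", Members: []string{"jsmith", "adoe"}}},
		map[string]string{"U123ABC456": "jsmith"},
	)
	fmt.Println(idx.IsInTeam("jsmith", "Platform SRE"))
	fmt.Println(idx.IsSlackUserInTeam("U123ABC456", "Platform SRE"))
	fmt.Println(idx.TeamsFor("adoe"))
}
```

The trade-off is extra memory and a slower load in exchange for cheap queries, which suits data that is read far more often than it is reloaded.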

Thread Safety

The service uses read-write mutex protection:

Read operations (queries): Multiple concurrent readers supported
Write operations (data loading): Exclusive access during updates
Hot reload: Atomic data replacement without query interruption
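The locking pattern described above can be sketched as follows. This is an assumption about the general technique, not a copy of the service's code: the Store and snapshot types are hypothetical. The new snapshot is built outside the lock, so writers hold the exclusive lock only for a pointer swap and readers are blocked as briefly as possible.

```go
package main

import (
	"fmt"
	"sync"
)

// snapshot is a hypothetical read-only view of the loaded data.
type snapshot struct {
	employees map[string]string // UID -> full name
}

// Store guards the current snapshot with a read-write mutex.
type Store struct {
	mu   sync.RWMutex
	data *snapshot
}

// Reload copies the input into a fresh snapshot without holding the lock,
// then takes the write lock only long enough to swap the pointer.
func (s *Store) Reload(employees map[string]string) {
	next := &snapshot{employees: make(map[string]string, len(employees))}
	for uid, name := range employees {
		next.employees[uid] = name
	}
	s.mu.Lock()
	s.data = next
	s.mu.Unlock()
}

// FullName takes the read lock, so any number of queries can run concurrently.
func (s *Store) FullName(uid string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if s.data == nil {
		return "", false // nothing loaded yet
	}
	name, ok := s.data.employees[uid]
	return name, ok
}

func main() {
	var store Store
	store.Reload(map[string]string{"jsmith": "Jane Smith"})

	// Readers run concurrently with a reload.
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			store.FullName("jsmith")
		}()
	}
	store.Reload(map[string]string{"jsmith": "Jane Q. Smith"})
	wg.Wait()

	name, _ := store.FullName("jsmith")
	fmt.Println(name)
}
```

A reader always sees either the old snapshot or the new one in full, never a partially updated map.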

Data Structure Optimization

// Optimized for fast lookups
type Data struct {
    Lookups  Lookups  // Direct object access: O(1)
    Indexes  Indexes  // Pre-computed relationships: O(1) 
}

// Example: Employee lookup
employee := data.Lookups.Employees[uid]  // Direct map access

// Example: Team membership 
memberships := data.Indexes.Membership.MembershipIndex[uid]  // Pre-computed list

Service Methods

Employee Queries

// Primary employee lookup by UID
employee := service.GetEmployeeByUID("jsmith")

// Slack integration - lookup by Slack user ID
employee = service.GetEmployeeBySlackID("U123ABC456")

// Returns *Employee with: UID, FullName, Email, JobTitle, SlackUID

Team Operations

// Get team details
team := service.GetTeamByName("Platform SRE")

// Get all teams for an employee
teams := service.GetTeamsForUID("jsmith")

// Check team membership
isMember := service.IsEmployeeInTeam("jsmith", "Platform SRE")
isSlackMember := service.IsSlackUserInTeam("U123ABC456", "Platform SRE")

// Get all team members
members := service.GetTeamMembers("Platform SRE")

Organization Queries

// Get organization details
org := service.GetOrgByName("Engineering")

// Check organization membership (includes inherited via teams)
isMember := service.IsEmployeeInOrg("jsmith", "Engineering")
isSlackMember := service.IsSlackUserInOrg("U123ABC456", "Engineering")

// Get complete organizational context
orgs := service.GetUserOrganizations("U123ABC456")
// Returns: the teams, orgs, pillars, and team_groups the user belongs to

Performance Characteristics

| Operation | Complexity | Index Used |
|-----------|------------|------------|
| `GetEmployeeByUID` | O(1) | `lookups.employees` |
| `GetEmployeeBySlackID` | O(1) | `indexes.slack_id_mappings` |
| `GetTeamsForUID` | O(1) | `indexes.membership.membership_index` |
| `IsEmployeeInTeam` | O(1) | Pre-computed membership |
| `GetUserOrganizations` | O(1) | Flattened hierarchy index |

No expensive tree traversals - all organizational relationships are pre-computed during indexing.
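To illustrate what "pre-computed during indexing" can mean for a hierarchy, here is a small sketch that flattens a child-to-parent map into per-node ancestor sets. The functions flattenAncestors and isUnder are hypothetical and the "Company" root is made up for the example. The real index is generated by the Python indexer and its layout may differ.

```go
package main

import "fmt"

// flattenAncestors takes a child -> parent map and returns, for every child,
// the set of all its ancestors. The tree walk happens here, once, at index time.
func flattenAncestors(parent map[string]string) map[string]map[string]struct{} {
	flat := make(map[string]map[string]struct{}, len(parent))
	for node := range parent {
		set := make(map[string]struct{})
		// Walk upward until the root is reached or a cycle is detected.
		for cur, ok := parent[node]; ok; cur, ok = parent[cur] {
			if _, seen := set[cur]; seen || cur == node {
				break
			}
			set[cur] = struct{}{}
		}
		flat[node] = set
	}
	return flat
}

// isUnder answers an ancestry query with two map lookups and no traversal.
func isUnder(flat map[string]map[string]struct{}, node, ancestor string) bool {
	_, ok := flat[node][ancestor]
	return ok
}

func main() {
	// Hypothetical hierarchy: team -> org -> top-level unit.
	parent := map[string]string{
		"Platform SRE": "Engineering",
		"Engineering":  "Company",
	}
	flat := flattenAncestors(parent)
	fmt.Println(isUnder(flat, "Platform SRE", "Company"))
	fmt.Println(isUnder(flat, "Engineering", "Platform SRE"))
}
```

Combined with a membership index, a check such as "is this employee in this organization, directly or through a team" reduces to a few map lookups per team the employee belongs to.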

Data Sources

The package supports pluggable data sources through the DataSource interface:

Built-in Data Sources

FileDataSource - Local JSON files
- No additional dependencies
- Supports file watching with polling
- Ideal for development and file-based deployments
GCSDataSource - Google Cloud Storage
- Requires GCS SDK: go get cloud.google.com/go/storage
- Build with -tags gcs for full functionality
- Supports hot reload with configurable polling interval
- Uses Application Default Credentials (ADC) or service account JSON
- Ideal for production cross-cluster deployments in GCP

Custom Data Sources

Implement the DataSource interface to create custom sources:

type DataSource interface {
    Load(ctx context.Context) (io.ReadCloser, error)
    Watch(ctx context.Context, callback func() error) error
    String() string
}

Examples of custom sources you could implement:

HTTP/HTTPS endpoints
AWS S3 or other S3-compatible storage
Git repositories
Database queries
Redis/Memcached for caching layers
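As a starting point for any of these, the sketch below implements the interface with a generic polling source. Several things here are assumptions rather than documented behavior: the interface is redeclared locally so the example compiles on its own, PollingSource, FetchFunc, StaticFetch and loadString are hypothetical names, and Watch is assumed to block until the context is cancelled. Check the built-in sources for the semantics the service actually expects.

```go
package main

import (
	"bytes"
	"context"
	"crypto/sha256"
	"fmt"
	"io"
	"time"
)

// DataSource mirrors the interface shown above so this sketch is self-contained.
type DataSource interface {
	Load(ctx context.Context) (io.ReadCloser, error)
	Watch(ctx context.Context, callback func() error) error
	String() string
}

// FetchFunc is a hypothetical hook that retrieves the raw JSON from anywhere
// (HTTP, S3, a database, ...).
type FetchFunc func(ctx context.Context) ([]byte, error)

// StaticFetch returns a FetchFunc that always yields the same bytes (for demos).
func StaticFetch(b []byte) FetchFunc {
	return func(context.Context) ([]byte, error) { return b, nil }
}

// PollingSource adapts a FetchFunc to the DataSource interface.
type PollingSource struct {
	Name     string
	Fetch    FetchFunc
	Interval time.Duration
}

// Load fetches the payload and exposes it as an io.ReadCloser.
func (p *PollingSource) Load(ctx context.Context) (io.ReadCloser, error) {
	b, err := p.Fetch(ctx)
	if err != nil {
		return nil, fmt.Errorf("load %s: %w", p.Name, err)
	}
	return io.NopCloser(bytes.NewReader(b)), nil
}

// Watch polls on a ticker and invokes callback when the content hash changes.
// The first poll only records a baseline. Assumption: Watch blocks until ctx ends.
func (p *PollingSource) Watch(ctx context.Context, callback func() error) error {
	if p.Interval <= 0 {
		return fmt.Errorf("watch %s: interval must be positive", p.Name)
	}
	var last [sha256.Size]byte
	seen := false
	ticker := time.NewTicker(p.Interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			b, err := p.Fetch(ctx)
			if err != nil {
				continue // treat as transient and retry on the next tick
			}
			sum := sha256.Sum256(b)
			if seen && sum != last {
				if err := callback(); err != nil {
					return err
				}
			}
			last, seen = sum, true
		}
	}
}

// String identifies the source in log output.
func (p *PollingSource) String() string { return "polling:" + p.Name }

// loadString is a small helper that reads a source fully into a string.
func loadString(ds DataSource) (string, error) {
	rc, err := ds.Load(context.Background())
	if err != nil {
		return "", err
	}
	defer rc.Close()
	b, err := io.ReadAll(rc)
	return string(b), err
}

func main() {
	src := &PollingSource{Name: "demo", Fetch: StaticFetch([]byte(`{}`)), Interval: 10 * time.Millisecond}
	body, err := loadString(src)
	fmt.Println(src, body, err)

	// Watch for a short time; the static payload never changes.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	fmt.Println("watch stopped:", src.Watch(ctx, func() error { return nil }))
}
```

Hashing the payload keeps change detection independent of the backend. A source that exposes cheap metadata, such as an ETag or an object generation number, could compare that instead of downloading the full content on every poll.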

Logging

The package uses structured logging via the logr interface, making it compatible with OpenShift and Kubernetes logging standards.

Default: Uses stdr (standard library logger wrapper)

OpenShift Integration:

import "k8s.io/klog/v2/klogr"
import orgdatacore "github.com/openshift-eng/cyborg-data"

func init() {
    orgdatacore.SetLogger(klogr.New())
}

Log events include data source changes, reload operations, and error conditions with structured key-value context.

Dependencies

Go 1.19+
Standard library only (no external dependencies for file sources)
Optional: GCS SDK for Google Cloud Storage support (cloud.google.com/go/storage)
