
Commit c42babe

Merge pull request #267 from ContextLab/dev
minor updates to v0.8.1 for compatibility with numpy 2.0+ and pandas 2.0+
2 parents adadb0e + 591bbd8 commit c42babe

8 files changed: +240 lines, -4 lines


.github/workflows/test.yml

Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
name: Tests

on:
  push:
    branches: [ master, dev ]
  pull_request:
    branches: [ master, dev ]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: ['3.9', '3.10', '3.11', '3.12']

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Cache pip dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ matrix.python-version }}-${{ hashFiles('**/requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-${{ matrix.python-version }}-
            ${{ runner.os }}-pip-

      - name: Install system dependencies (Ubuntu)
        if: matrix.os == 'ubuntu-latest'
        run: |
          sudo apt-get update
          sudo apt-get install -y ffmpeg

      - name: Install system dependencies (macOS)
        if: matrix.os == 'macos-latest'
        run: |
          brew install ffmpeg

      - name: Install system dependencies (Windows)
        if: matrix.os == 'windows-latest'
        run: |
          choco install ffmpeg
        continue-on-error: true

      - name: Install Python dependencies
        run: |
          python -m pip install --upgrade pip
          pip install pytest pytest-cov
          pip install -r requirements.txt

      - name: Install package in development mode
        run: |
          pip install -e .

      - name: Run pytest
        run: |
          pytest -v --tb=short

      - name: Run pytest with coverage (Ubuntu Python 3.12 only)
        if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12'
        run: |
          pytest --cov=hypertools --cov-report=xml --cov-report=term-missing

      - name: Upload coverage to Codecov (Ubuntu Python 3.12 only)
        if: matrix.os == 'ubuntu-latest' && matrix.python-version == '3.12'
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          flags: unittests
          name: codecov-umbrella
          fail_ci_if_error: false

CLAUDE.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

HyperTools is a Python library for visualizing and manipulating high-dimensional data. It provides a unified interface for dimensionality reduction, data alignment, clustering, and visualization, built on top of matplotlib, scikit-learn, and seaborn.

## Key Commands

### Testing
- `pytest` - Run all tests from the hypertools/ directory
- `pytest tests/test_<module>.py` - Run tests for a specific module
- `pytest tests/test_<module>.py::test_<function>` - Run a specific test function
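For example, the reduction test updated in this changeset can be run on its own:

```bash
pytest tests/test_reduce.py::test_reduce_TSNE -v
```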
### Development Setup
- `pip install -e .` - Install in development mode
- `pip install -r requirements.txt` - Install dependencies
- `pip install -r docs/doc_requirements.txt` - Install documentation dependencies

### Documentation
- `cd docs && make html` - Build HTML documentation
- `cd docs && make clean` - Clean documentation build files

## Code Architecture

### Core Components

**DataGeometry Class** (`hypertools/datageometry.py`)
- Central data container that holds raw data, transformed data, and transformation parameters
- Stores matplotlib figure/axes handles and animation objects
- Contains normalization, reduction, and alignment model parameters

**Main API Functions** (`hypertools/__init__.py`)
- `plot()` - Primary visualization function
- `analyze()` - Data analysis and dimensionality reduction
- `reduce()` - Dimensionality reduction utilities
- `align()` - Data alignment across datasets
- `normalize()` - Data normalization
- `describe()` - Data description and summary
- `cluster()` - Clustering functionality
- `load()` - Data loading utilities

**Tools Module** (`hypertools/tools/`)
- `align.py` - Hyperalignment and Procrustes alignment
- `reduce.py` - Dimensionality reduction (PCA, t-SNE, UMAP, etc.)
- `normalize.py` - Data normalization methods
- `cluster.py` - K-means and other clustering algorithms
- `format_data.py` - Data preprocessing and formatting
- `text2mat.py` - Text-to-matrix conversion
- `df2mat.py` - DataFrame-to-matrix conversion
- `load.py` - Data loading from various sources
- `missing_inds.py` - Missing data handling
- `procrustes.py` - Procrustes analysis

**Plot Module** (`hypertools/plot/`)
- `plot.py` - Main plotting interface and logic
- `backend.py` - matplotlib backend configuration
- `draw.py` - Low-level drawing functions

**External Dependencies** (`hypertools/_externals/`)
- `ppca.py` - Probabilistic Principal Component Analysis
- `srm.py` - Shared Response Model

### Data Flow

1. **Input Processing**: Data is formatted and validated through `format_data()`
2. **Normalization**: Optional data normalization via `normalize()`
3. **Alignment**: Optional cross-dataset alignment via `align()`
4. **Dimensionality Reduction**: Data is reduced via `reduce()`
5. **Clustering**: Optional clustering via `cluster()`
6. **Visualization**: Final plotting through `plot()`
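A minimal sketch of this flow using the top-level API listed above (data shapes and keyword values here are illustrative assumptions, not repository defaults):

```python
import numpy as np
import hypertools as hyp

# two synthetic datasets with matching dimensions (illustrative)
data = [np.random.randn(100, 10), np.random.randn(100, 10)]

# each call formats/validates its input internally (step 1)
normed = hyp.normalize(data)                 # 2. normalization
aligned = hyp.align(normed)                  # 3. cross-dataset alignment
reduced = hyp.reduce(aligned, ndims=3)       # 4. dimensionality reduction
labels = hyp.cluster(reduced, n_clusters=2)  # 5. clustering (returns labels)
geo = hyp.plot(reduced, '.')                 # 6. visualization; returns a DataGeometry
```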
### Key Design Patterns

- **Modular Architecture**: Each major operation (align, reduce, normalize, etc.) is in its own module
- **Unified Interface**: All functions accept similar input formats (lists of arrays, DataFrames, etc.)
- **Flexible Data Types**: Supports numpy arrays, pandas DataFrames, text data, and mixed inputs
- **Matplotlib Integration**: Deep integration with matplotlib for customizable visualizations
- **Animation Support**: Built-in support for animated visualizations

## Development Notes

- The package follows a functional programming style with separate modules for each operation
- All major functions are designed to work with multiple input formats
- The DataGeometry class serves as the central data container and state manager
- Tests are located in the `tests/` directory and follow pytest conventions
- Documentation is built with Sphinx and uses example galleries
- The codebase maintains compatibility with Python 3.9+

## Testing Strategy

- Unit tests for individual tools and functions
- Integration tests for end-to-end workflows
- Example-based testing through documentation
- Visual regression testing for plot outputs

hypertools/config.py

Lines changed: 6 additions & 2 deletions
@@ -1,4 +1,8 @@
-from pkg_resources import get_distribution
+try:
+    from importlib.metadata import version
+except ImportError:
+    # Fallback for Python < 3.8
+    from importlib_metadata import version
 
 
-__version__ = get_distribution('hypertools').version
+__version__ = version('hypertools')
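A quick sanity check of the new lookup (illustrative; assumes the package has been installed, e.g. via `pip install -e .`):

```python
from importlib.metadata import version

# importlib.metadata reads the version of the installed distribution,
# replacing the deprecated pkg_resources.get_distribution() call
print(version('hypertools'))  # e.g. '0.8.1'
```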

hypertools/tools/describe.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 import warnings
 import numpy as np
-from scipy.stats.stats import pearsonr
+from scipy.stats import pearsonr
 from scipy.spatial.distance import cdist
 import matplotlib.pyplot as plt
 import seaborn as sns
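The replacement import is the public location of the same function; a minimal check (illustrative values):

```python
from scipy.stats import pearsonr

# same behavior as the deprecated scipy.stats.stats path
r, p = pearsonr([1, 2, 3, 4], [1.2, 1.9, 3.1, 4.2])
print(round(r, 3))  # correlation close to 1
```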

notes/github_actions_info.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# GitHub Actions CI/CD Setup

## Test Matrix
The GitHub Actions workflow (`/.github/workflows/test.yml`) runs comprehensive tests on:

### Python Versions
- Python 3.9
- Python 3.10
- Python 3.11
- Python 3.12

### Operating Systems
- Ubuntu Latest (Linux)
- Windows Latest
- macOS Latest

### Features
- **Dependency caching**: Pip cache is used to speed up builds
- **System dependencies**: FFmpeg is installed for animation support
- **Coverage reporting**: Coverage is collected on Ubuntu Python 3.12 and uploaded to Codecov
- **Matrix strategy**: Tests run in parallel across all combinations (12 total jobs)
- **Fail-fast disabled**: All combinations run even if one fails

## Triggers
- Push to `master` or `dev` branches
- Pull requests to `master` or `dev` branches

## Badge
Add this badge to README.md to show build status:
```markdown
[![Tests](https://github.com/ContextLab/hypertools/workflows/Tests/badge.svg)](https://github.com/ContextLab/hypertools/actions)
```

## Local Testing
To run the same tests locally:
```bash
pytest -v --tb=short
```

For coverage:
```bash
pytest --cov=hypertools --cov-report=xml --cov-report=term-missing
```
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
File,Issue Type,Description,Current Status,Priority,Notes
hypertools/tools/reduce.py,NumPy Deprecated,np.string_ removed in NumPy 2.0,Fixed,High,Already fixed in line 116 comment
hypertools/config.py,Package,pkg_resources deprecated,Fixed,High,Replaced with importlib.metadata with fallback
hypertools/tools/describe.py,SciPy Deprecated,scipy.stats.stats.pearsonr deprecated,Fixed,High,Changed import to scipy.stats.pearsonr
hypertools/_shared/helpers.py,NumPy Deprecated,Potential deprecated features,OK,High,No issues found - uses compatible numpy patterns
hypertools/tools/format_data.py,NumPy Array,Array creation and dtype handling,OK,High,Uses np.float64 - compatible with numpy 2.0+
hypertools/tools/align.py,NumPy Array,Matrix operations and dtypes,OK,High,Compatible numpy array operations
hypertools/tools/normalize.py,NumPy Array,Array operations,OK,Medium,Compatible numpy array operations
hypertools/tools/df2mat.py,Pandas,DataFrame to matrix conversion,OK,High,Compatible with pandas 2.0+ patterns
hypertools/datageometry.py,Pandas,DataFrame handling,OK,High,Uses to_dict('list') - compatible with pandas 2.0+
hypertools/tools/text2mat.py,Pandas,Text processing with pandas,OK,Medium,Compatible pandas operations
hypertools/tools/load.py,NumPy/Pandas,Data loading operations,OK,Medium,Compatible with numpy 2.0+ and pandas 2.0+
hypertools/_externals/srm.py,NumPy Random,Uses np.random.seed and np.random.random,OK,Low,These are still supported in numpy 2.0
tests/,NumPy/Pandas,Test compatibility,OK,High,All 129 tests pass with numpy 2.0+ and pandas 2.0+
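For context on the first row: `np.string_` was an alias of `np.bytes_` that NumPy 2.0 removed. A minimal sketch of the kind of replacement involved (illustrative only; the actual fix in `reduce.py` is not shown in this diff):

```python
import numpy as np

# np.string_ (alias of np.bytes_) no longer exists under NumPy 2.0;
# type checks should target the surviving scalar types instead
x = np.array(['text'])[0]
print(isinstance(x, (str, np.str_)))  # True: unicode array scalar
print(isinstance(x, np.bytes_))       # False: not a bytes scalar
```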

requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -8,3 +8,4 @@ numpy>=2.0.0
 umap-learn>=0.5.5
 requests>=2.31.0
 ipympl>=0.9.3
+importlib_metadata>=1.0.0; python_version < "3.8"

tests/test_reduce.py

Lines changed: 1 addition & 1 deletion
@@ -92,7 +92,7 @@ def test_reduce_MiniBatchDictionaryLearning():
 
 
 def test_reduce_TSNE():
-    reduced_data_3d = reducer(data, reduce='TSNE', ndims=3)
+    reduced_data_3d = reducer(data, reduce={'model': 'TSNE', 'params': {'perplexity': 5}}, ndims=3)
     assert reduced_data_3d[0].shape==(10,3)
 
 
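The dict-style `reduce` spec forwards model parameters to the underlying scikit-learn estimator; recent scikit-learn releases require TSNE's `perplexity` to be smaller than the number of samples (10 in this test), hence the explicit `perplexity: 5`. A sketch of the same pattern through the public API (data shape mirrors the test; exact defaults may differ):

```python
import numpy as np
import hypertools as hyp

data = [np.random.randn(10, 20)]  # small sample count, as in the test

# dict form: 'model' names the reducer, 'params' is passed to that model
reduced = hyp.reduce(data, reduce={'model': 'TSNE', 'params': {'perplexity': 5}},
                     ndims=3)
print(reduced[0].shape)  # (10, 3)
```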