Capivara

Overview

Capivara provides spectral segmentation tools for Integral Field Unit (IFU) data cubes. Version 0.2.0 adds built-in missing-data support in the exact workflow, a medoid-based large-cube engine, Sagui-style white-light starlet masking, variance-aware spectral summaries, and SNR-guided component selection.

The core segmentation API is intentionally small:

  • segment() for the standard exact workflow, including missing spectral channels.
  • segment_big_cube() for very large cubes where exact pairwise distances are too expensive in RAM.

Current Capivara Mosaic

Current Capivara mosaic

Sagui Comparison Mosaic

Sagui comparison mosaic

Both mosaics are displayed with the same fixed size to make the visual comparison easier.

What’s New In 0.2.0

  • segment() now handles missing spectral channels by default.
  • segment_big_cube() now uses block medoids rather than block averages, improving compact structures in large cubes.
  • Both backends can optionally use a Sagui-style photometric mask built from the white-light image.
  • summarize_cluster_spectra() returns median, summed, and inverse-variance-weighted spectra.
  • choose_ncomp_by_snr() helps select Ncomp from an SNR threshold when a variance cube is available.
  • torch is now optional; Capivara falls back to base distance calculations when it is not installed.
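
To illustrate the difference between block medoids and block averages, here is a minimal base R sketch. It is not Capivara's internal code: the helper `block_medoid()` and the toy data are hypothetical. A medoid is the block member with the smallest total distance to the other members, so it is always an observed spectrum, whereas an average can be pulled toward an outlier.

```r
# Toy block: 4 spaxels x 5 spectral channels; the last spaxel is an outlier.
block <- rbind(
  c(1.0, 1.1, 0.9, 1.0, 1.0),
  c(1.1, 1.0, 1.0, 0.9, 1.0),
  c(0.9, 1.0, 1.1, 1.0, 1.1),
  c(9.0, 9.5, 8.5, 9.0, 9.2)
)

# Hypothetical helper: return the row whose summed Euclidean distance
# to all other rows is smallest.
block_medoid <- function(spectra) {
  d <- as.matrix(dist(spectra))
  spectra[which.min(rowSums(d)), ]
}

avg <- colMeans(block)      # pulled upward by the outlier spaxel
med <- block_medoid(block)  # one of the observed, non-outlier spectra

print(rbind(average = avg, medoid = med))
```

In this toy case the average sits far from every observed spectrum, while the medoid stays with the majority of the block.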

Installation

Install Capivara from GitHub using the following commands:

install.packages("remotes")
remotes::install_github("RafaelSdeSouza/capivara")
library(capivara)

Optional GPU acceleration:

install.packages("torch")
torch::install_torch()
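
The README does not say how Capivara decides between torch and the base fallback, so the following is only a sketch of how a user might check what is available in the current session. It uses `requireNamespace()` from base R and `torch::torch_is_installed()` and `torch::cuda_is_available()` from the torch package; the `backend` variable is a hypothetical name used for illustration.

```r
# requireNamespace() tests whether the torch R package can be loaded;
# torch_is_installed() tests whether the libtorch binaries are present.
has_torch <- requireNamespace("torch", quietly = TRUE) &&
  torch::torch_is_installed()

backend <- if (!has_torch) {
  "base R"
} else if (torch::cuda_is_available()) {
  "torch (GPU)"
} else {
  "torch (CPU)"
}

message("Distance backend available in this session: ", backend)
```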

Usage

Basic Segmentation

library(capivara)
cube <- FITSio::readFITS("manga-8140-12703-LOGCUBE.fits")

res <- segment(cube, Ncomp = 20)
plot_cluster(res)

Missing Data

segment() now handles missing spectral channels directly, so the same exact workflow works on masked cubes without a separate function.
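
The README does not describe how segment() treats missing channels internally. As an illustration of one common approach, and not necessarily the one Capivara uses, base R's `dist()` skips channels that are `NA` in either spectrum and rescales the squared sum by the fraction of channels actually used. The toy matrix below is synthetic.

```r
set.seed(1)

# Toy data: 5 spaxels (rows) x 8 spectral channels (columns).
spectra <- matrix(rnorm(5 * 8), nrow = 5)
spectra[2, 3:4] <- NA   # mask two channels in spaxel 2

# dist() tolerates NA: affected pairs use only the channels present in both rows.
d <- as.matrix(dist(spectra))

# Manual check for spaxels 1 and 2: sum over shared channels,
# scaled up by (total channels / channels used).
ok <- !is.na(spectra[1, ]) & !is.na(spectra[2, ])
manual <- sqrt(sum((spectra[1, ok] - spectra[2, ok])^2) * ncol(spectra) / sum(ok))

print(c(dist = unname(d[1, 2]), manual = manual))
```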

Large Cubes

Use segment_big_cube() when the cube is too large for the traditional exact workflow and the all-pairs distance matrix would likely exhaust available RAM.

res_large <- segment_big_cube(cube, Ncomp = 20, block_size = 6)
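
To get a feel for when the all-pairs distance matrix becomes a problem, the back-of-the-envelope sketch below estimates the memory needed to store one double per spaxel pair, as base R's `dist()` does. The helper `pairwise_dist_gb()` and the cube sizes are hypothetical and ignore any extra working memory.

```r
# Memory needed to hold all pairwise distances between spaxels,
# stored as doubles (8 bytes) for the lower triangle only.
pairwise_dist_gb <- function(nx, ny) {
  n <- as.numeric(nx) * as.numeric(ny)  # number of spaxels
  n_pairs <- n * (n - 1) / 2            # unique spaxel pairs
  n_pairs * 8 / 1e9                     # gigabytes (10^9 bytes)
}

sizes <- c(50, 100, 300)
estimate <- data.frame(
  side    = sizes,
  spaxels = sizes^2,
  dist_gb = sapply(sizes, function(s) pairwise_dist_gb(s, s))
)
print(estimate)
```

The cost grows with the square of the number of spaxels, so a 300 x 300 frame needs roughly 32 GB for the distances alone, while a 100 x 100 frame needs about 0.4 GB.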

Sagui-style Starlet Masking

The starlet mask can be built from the white-light image, then applied back to the full cube before clustering.

res_star <- segment(
  cube,
  Ncomp = 20,
  use_starlet_mask = TRUE,
  starlet_J = 5,
  starlet_scales = 2:5,
  include_coarse = FALSE,
  denoise_k = 0,
  positive_only = TRUE,
  mask_mode = "na"
)

plot_cluster(res_star)
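
The sketch below illustrates the two steps around the mask: collapsing a cube into a white-light image and applying a 2D mask back to every spectral channel. The mask itself is a plain flux threshold standing in for the starlet filter, which is not reproduced here; the synthetic cube and all variable names are hypothetical.

```r
set.seed(42)

# Synthetic cube: 6 x 6 spaxels, 10 channels, faint noise plus a bright 2 x 2 source.
nx <- 6; ny <- 6; nl <- 10
cube_arr <- array(rnorm(nx * ny * nl, sd = 0.1), dim = c(nx, ny, nl))
cube_arr[3:4, 3:4, ] <- cube_arr[3:4, 3:4, ] + 5

# White-light image: sum of each spaxel's spectrum over the spectral axis.
white <- apply(cube_arr, c(1, 2), sum, na.rm = TRUE)

# Stand-in mask (NOT a starlet transform): keep spaxels above half the peak flux.
keep <- white > 0.5 * max(white)

# Apply the 2D mask to every channel by setting rejected spaxels to NA.
masked <- cube_arr
masked[rep(!keep, times = nl)] <- NA

print(sum(keep))  # number of spaxels retained
```

Setting rejected spaxels to NA mirrors the spirit of `mask_mode = "na"` in the call above, although the exact behaviour of that option is not documented in this README.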

For large cubes, the same white-light mask can be combined with the scalable backend:

res_star_large <- segment_big_cube(
  cube,
  Ncomp = 20,
  use_starlet_mask = TRUE,
  block_size = 6
)

Comparison on manga-8140-12703-LOGCUBE.fits using the full frame. The starlet mask is computed from the white-light image of the whole cube, not from a cropped cube.

Reproducible MaNGA Examples

These comparison panels were generated with the current public API on full MaNGA cubes, using the exact and large-cube backends under the same package configuration.

MaNGA 8135-12701

MaNGA 8135-12701 comparison

MaNGA 8443-6102

MaNGA 8443-6102 comparison

MaNGA 10224-6104

MaNGA 10224-6104 comparison

MaNGA 11749-12701

MaNGA 11749-12701 comparison

Variance-aware Analysis

When a variance cube is available, use it in the post-segmentation analysis layer rather than in segment() itself.

var_cube <- FITSio::readFITS("manga-8140-12703-VARCUBE.fits")

choice <- choose_ncomp_by_snr(
  cube,
  var_cube = var_cube$imDat,
  k_values = 4:20,
  target_snr = 20
)

res <- segment(cube, Ncomp = choice$Ncomp)
spec_summary <- summarize_cluster_spectra(res, var_cube = var_cube$imDat)

median_spectra are useful as robust representative spectra for inspection. For flux and SNR calculations, sum_spectra or the inverse-variance-weighted summary are usually better choices.
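
The three kinds of summary can be written out for a toy cluster in base R. This is a sketch of the standard formulas, assuming independent pixels, and not the body of summarize_cluster_spectra(); the variable names are hypothetical and do not correspond to the fields of the returned object.

```r
# Toy cluster: 2 spaxels (rows) x 2 channels (columns), with per-pixel variances.
flux     <- rbind(c(2, 4), c(4, 8))
variance <- rbind(c(1, 1), c(3, 3))

# Robust representative spectrum.
median_spec <- apply(flux, 2, median)

# Summed spectrum; variances add for independent pixels.
sum_spec <- colSums(flux)
sum_var  <- colSums(variance)
snr_sum  <- sum_spec / sqrt(sum_var)

# Inverse-variance-weighted mean spectrum and its variance.
w        <- 1 / variance
ivw_spec <- colSums(w * flux) / colSums(w)
ivw_var  <- 1 / colSums(w)

print(rbind(median_spec, sum_spec, snr_sum, ivw_spec, ivw_var))
```

The inverse-variance weights favour the less noisy spaxel, so the weighted spectrum lies closer to the first row than the plain median does.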

Reconstructed Cubes

Use reconstruct_cluster_cube() to build a representative cube from cluster templates, or reconstruct_flux_preserving_cube() when you want a model cube that preserves the summed flux spectrum of the segmented data for later spectral fitting.

rep_cube <- reconstruct_cluster_cube(res_star, template = "median")
fit_cube <- reconstruct_flux_preserving_cube(res_star)
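
The README does not spell out how the two reconstructions are computed. The sketch below shows one plausible reading on a toy flux matrix: a template reconstruction that copies each cluster's median spectrum into its spaxels, and a flux-preserving reconstruction that redistributes each cluster's summed spectrum in proportion to each spaxel's total flux. Both schemes, and all names used, are assumptions for illustration only.

```r
set.seed(7)

# Toy data: 6 spaxels (rows) x 4 channels (columns), two clusters of 3 spaxels.
n_spax <- 6; n_chan <- 4
flux   <- matrix(runif(n_spax * n_chan, min = 1, max = 2), nrow = n_spax)
labels <- c(1, 1, 1, 2, 2, 2)

recon_median <- flux
recon_flux   <- flux

for (k in unique(labels)) {
  idx   <- which(labels == k)
  block <- flux[idx, , drop = FALSE]

  # Template reconstruction: every spaxel gets the cluster median spectrum.
  med <- apply(block, 2, median)
  recon_median[idx, ] <- matrix(med, nrow = length(idx), ncol = n_chan, byrow = TRUE)

  # Flux-preserving reconstruction (assumed scheme): share the cluster's summed
  # spectrum among spaxels according to their fraction of the total flux.
  total <- colSums(block)
  share <- rowSums(block) / sum(block)
  recon_flux[idx, ] <- outer(share, total)
}

# The summed spectrum of the flux-preserving model matches the data.
print(all.equal(colSums(recon_flux), colSums(flux)))
```

The median-template version does not generally preserve the summed spectrum, which is why a flux-preserving variant is useful as input to later spectral fitting.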

Release Notes

See NEWS.md for the 0.2.0 release summary.

Attribution

If you use the Capivara code in your research, please cite the Capivara paper. A BibTeX entry is:

@article{desouza2025capivara,
  author = {de Souza, Rafael S. and Dahmer-Hahn, Luis G. and Shen, Shiyin and Chies-Santos, Ana L. and Chen, Mi and Rahna, P. T. and Ye, Renhao and Tahmasebzade, Behzad},
  title = {CAPIVARA: a spectral-based segmentation method for IFU data cubes},
  journal = {Monthly Notices of the Royal Astronomical Society},
  year = {2025},
  volume = {539},
  number = {4},
  pages = {3166--3179},
  doi = {10.1093/mnras/staf688}
}

Dependencies

  • torch: Optional GPU-accelerated tensor computations.
  • ggplot2: Visualization.
  • FITSio: Reading and handling FITS files.
  • reshape2: Data manipulation.

References

  1. MaNGA Survey: Bundy, K., et al. (2015). Overview of the SDSS-IV MaNGA Survey: Mapping Nearby Galaxies at Apache Point Observatory. The Astrophysical Journal, 798(1), 7. https://doi.org/10.1088/0004-637X/798/1/7
  2. Capivara Code: RafaelSdeSouza/capivara
  3. Capivara Methodology: de Souza, R. S., et al. (2025). CAPIVARA: A spectral-based segmentation method for IFU data cubes. Monthly Notices of the Royal Astronomical Society, 539(4), 3166–3179. https://doi.org/10.1093/mnras/staf688
  4. Torch in R: Paszke, A., et al. (2019). PyTorch: An Imperative Style, High-Performance Deep Learning Library. Advances in Neural Information Processing Systems.

For more information, check the Capivara GitHub webpage.
