Copilot AI commented Oct 26, 2025

Fix for assortativity_nominal with character types

  • Understand the issue: assortativity_nominal fails with character labels (like 'A', 'B') because as.numeric() produces NAs
  • Examine the current implementation in R/aaa-auto.R
  • Create test cases for character labels (letters and string numbers)
  • Implement fix: Use conditional as.integer(as.factor(types)) only for character vectors
  • Run tests to validate the fix
  • Format code with air
  • Run full test suite (6959 tests pass)
  • Update documentation
  • Address review feedback: Only convert character vectors for clarity
  • Run security checks
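
The root cause named in the first item can be reproduced in base R, without igraph. This is a minimal sketch, and the variable names are illustrative only:

```r
# Numeric strings survive as.numeric(); letter labels become NA (with a warning).
digit_labels  <- c("1", "2", "2", "1")
letter_labels <- c("A", "B", "B", "A")

as.numeric(digit_labels)  # 1 2 2 1

# The warning is suppressed here only to keep the demonstration quiet.
converted <- suppressWarnings(as.numeric(letter_labels))
anyNA(converted)  # TRUE

# Going through a factor yields consecutive integer codes starting at 1.
codes <- as.integer(as.factor(letter_labels))
codes  # 1 2 2 1
```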

Summary

Fixed a bug where assortativity_nominal() failed on character labels such as 'A' and 'B', because as.numeric() turned them into NAs. The fix applies as.integer(as.factor(types)) to character vectors only, mapping them to consecutive integers starting from 1, which is what the C implementation expects.
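
A minimal sketch of the conditional conversion described above. `normalize_types()` is a hypothetical helper named for illustration; it is not an igraph function, and the actual change in the PR may be written differently:

```r
# Hypothetical helper, not part of igraph: convert character labels only.
normalize_types <- function(types) {
  if (is.character(types)) {
    # Factor levels are sorted, so the codes are consecutive integers from 1.
    types <- as.integer(as.factor(types))
  }
  types
}

normalize_types(c("A", "B", "A"))  # 1 2 1
normalize_types(c("1", "2", "2"))  # 1 2 2
normalize_types(c(2, 1, 2))        # numeric input is returned unchanged
```

The integer assigned to each label depends on the sorted set of labels present. Because the labels are nominal, only the grouping they induce is meant to carry information.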

Changes:

  1. Modified assortativity_nominal() in R/assortativity.R to conditionally convert only character types using as.integer(as.factor())
  2. Updated documentation to clarify character vector handling
  3. Added comprehensive test cases

Benefits:

  • Fixes character type bug (the reported issue)
  • Avoids unnecessary conversion overhead for numeric inputs
  • Maintains backward compatibility
  • All tests pass

Original prompt

This section details the original issue you should resolve

<issue_title>bug in assortativity_nominal</issue_title>
<issue_description>Describe the bug
When trying to compute assortativity for categorical labels attached to nodes, an error is returned when the labels are letters but not when the labels are numbers. The error message says that the bug is unexpected and asks for it to be reported with a reproducible example.

To reproduce

The script below reproduces the bug

library(igraph)

set.seed(2)
g <- sample_gnm(10, 20)

V(g)$random1 <- sample(c(1, 2), 10, replace = TRUE)
V(g)$random2 <- sample(c('1', '2'), 10, replace = TRUE)
V(g)$random3 <- sample(c('A', 'B'), 10, replace = TRUE)

# compute the assortativity of this node attribute 
assortativity_nominal(g, types = V(g)$random1) # this is OK 
assortativity_nominal(g, types = V(g)$random2) # this is OK
assortativity_nominal(g, types = V(g)$random3) # this leads to the output below

Error in assortativity_nominal(g, types = V(g)$random3) :
At core/core/vector.pmt:126 : Assertion failed: size >= 0. This is an unexpected igraph error; please report this as a bug, along with the steps to reproduce it.
Please restart your R session to avoid crashes or other surprising behavior.
In addition: Warning message:
In assortativity_nominal(g, types = V(g)$random3) :
NAs introduced by coercion

Version information
Which version of igraph are you using and where did you obtain it?

igraph_1.6.0
from CRAN

R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
</issue_description>

Comments on the Issue (you are @copilot in this section)

<comment_new>@szhorvat
> igraph_1.6.0

Please always test with the latest version before reporting issues. I cannot reproduce it with 2.0.2. Can you try this version?</comment_new>
<comment_new>@szhorvat
With the current version of igraph, you will need to use consecutive integers, starting at 1, to represent categories. Names won't work.

I agree that the situation is not ideal. I must note that I am not an R user or R programmer, so I can't judge very well what is reasonable in R. Neither do I make the decision about whether we will do anything about this. But here's a suggestion for an improvement.

Let me know what you think @krlmlr

First, notice that the error message is not very good. It talks about negative indices, as in C we index the categories from 0. In R we index from 1. There's thus the usual problem about how to phrase the error to fit both. igraph/igraph#2119

The types argument here represents categorical data. It would indeed be very nice if other representations than indices could be supported, for example string names. Categorical data appears in many places in igraph as an input argument, such as:

  • Community / cluster / component membership as passed to functions that take groups, such as: modularity calculation; vertex contraction
  • Vertex and edge colours in isomorphism functions
  • Categorical assortativity

... and possibly others I'm not thinking of now.

Some high-level languages support categorical data directly. Isn't this what factor is for in R? Mathematica does not have a data type for this, but I do have functions to convert other representations to category indices, and I allow categories to be specified in flexible ways.

Categories also have different representations, each being most useful in specific contexts: we can assign a category name to each object/vertex: vertex 1 is "blue", vertex 2 is "red", vertex 3 is "blue"; or we can list the category members: "blue" contains {1, 3}, "red" contains {2}.
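
The two representations can be converted into each other in base R. A small sketch using the example from the paragraph above; the variable names are illustrative:

```r
# Per-vertex representation: one category name per vertex.
colour <- c("blue", "red", "blue")

# Member-list representation: vertex indices grouped by category.
members <- split(seq_along(colour), colour)
# members$blue is c(1, 3); members$red is 2

# And back: rebuild the per-vertex vector from the member lists.
rebuilt <- character(length(colour))
for (name in names(members)) {
  rebuilt[members[[name]]] <- name
}
identical(rebuilt, colour)  # TRUE
```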

Should we then have a special Stimulus type specifically for categorical data? This would make it easy to auto-generate code that can handle various kinds of category representations that are convenient in the host language, and convert each to simple 0-based membership vectors that can be sent to C. The raw C errors we see here would never appear: error checking would be done by the function that converts the category representations. Users could work much more conveniently with such data.
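
One way such a converter could look on the R side, as a rough sketch only. `as_membership0()` is a hypothetical name; nothing like it is confirmed to exist in igraph or Stimulus. It accepts a factor, a character vector, or whole-number indices starting at 1, and reports bad input before anything reaches C:

```r
# Hypothetical converter: categorical input -> 0-based membership vector.
as_membership0 <- function(x) {
  if (anyNA(x)) {
    stop("Categories must not contain missing values.")
  }
  if (is.character(x)) {
    x <- as.factor(x)
  }
  if (is.factor(x)) {
    # droplevels() keeps the codes consecutive when some levels are unused.
    return(as.integer(droplevels(x)) - 1L)
  }
  if (!is.numeric(x) || any(x != round(x)) || any(x < 1)) {
    stop("Categories must be a factor, strings, or positive whole numbers.")
  }
  as.integer(x) - 1L
}

as_membership0(c("blue", "red", "blue"))  # 0 1 0
as_membership0(c(1, 2, 2))                # 0 1 1
```

Doing the validation in the converter is what would let a readable R-level error replace the raw C assertion quoted in the issue.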

Opinions, @krlmlr and @ntamas ?

Potentially related:

> Should we then have a special Stimulus type specifically for categorical data?

Yes, IMO it would be a good idea. Currently we have VERTEX_COLORS and EDGE_COLORS (probably only in the develop branch?). I think the name is ill-suited, but semantically it means the same thing, doesn't it?</comment_new>
<comment_new>@krlmlr
A simple types <- as.integer(as.factor(types)) in `assortativity_nom...



Copilot AI changed the title from "[WIP] Fix bug in assortativity_nominal for categorical labels" to "fix: Support character labels in assortativity_nominal()" Oct 26, 2025
Copilot AI requested a review from krlmlr October 26, 2025 19:22
Copilot finished work on behalf of krlmlr October 26, 2025 19:22

krlmlr commented Oct 26, 2025

@copilot: Only convert character vectors to integers, for clarity.


Copilot AI commented Oct 26, 2025

> @copilot: Only convert character vectors to integers, for clarity.

Done in 175cfd1. Now only character vectors are converted using as.integer(as.factor()), numeric inputs pass through unchanged.

Copilot finished work on behalf of krlmlr October 26, 2025 20:44

krlmlr left a comment


@copilot: Merge main, resolve conflicts as needed, and obey updated instructions.


Development

Successfully merging this pull request may close these issues.

bug in assortativity_nominal
