R Project Sprint 2023 - Dendrapply Refactor #6
Replies: 7 comments 4 replies
-
Just wanted to add a little more info here since I've been thinking about this: this project shouldn't take the entire sprint. If it completes with time to spare, I'd like to investigate improving other functionality with […]. I'm also definitely open to other suggestions and/or seeing if we can identify other opportunities for improvement during the sprint!
-
got the doc - will read over it and get back to you...
On Thu, Aug 17, 2023 at 2:00 PM Aidan Lakshman wrote:
Hi Dr. Gentleman,
I'm ecstatic that you're interested in helping with this! I think some of
the topics you mentioned are briefly addressed in the document submitted
and the bugzilla report, but I will work on putting together a more
comprehensive document on specifics and comparable implementations and post
it here prior to the sprint.
-Aidan
--
Robert Gentleman
-
Dendrograms are a very special kind of graph, and it might be better to use existing code and algorithms. Could you describe some use cases, just to get a sense of scale (how many leaves and how deep)? What do you want to do to each node? Are internal nodes interesting, or do you need to get to the leaves to do something? How many leaves do you want to handle? Are you sure you want to do this in custom C? One alternative is the Boost library (https://www.boost.org), which has some high-quality C++ code. I certainly agree that the list-like structure used in the current dendrogram code might not be the best one for traversal. It has been a few years since I interacted with that code base, but it has been very good and well maintained. If you look at the graph package (or, I suspect, the igraph package), you will see implementations of DFS and BFS search. It might be worth looking at more general approaches to graph traversal, especially where high-quality algorithms already exist, and also considering whether you want either of those, both(?), or something else. I admit to not having looked at any of the phylogeny code yet, but will try to track that down.
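For readers less familiar with the two traversal orders mentioned above, here is a minimal, language-neutral sketch in Python. It is not R's dendrogram code and not anything proposed in this thread: the nested-list `tree`, `is_leaf`, `dfs_leaves`, and `bfs_depths` are all hypothetical names invented for illustration, assuming only that an internal node is a list of children and a leaf is a label.

```python
from collections import deque

# Hypothetical stand-in for a dendrogram: an internal node is a list of
# children, a leaf is a string label. This only mimics the nested-list idea;
# it is not R's actual dendrogram representation.
tree = [["a", "b"], ["c", ["d", "e"]]]


def is_leaf(node):
    """A node is a leaf when it is not a list of children."""
    return not isinstance(node, list)


def dfs_leaves(root):
    """Iterative depth-first traversal; returns leaf labels left to right."""
    stack, leaves = [root], []
    while stack:
        node = stack.pop()
        if is_leaf(node):
            leaves.append(node)
        else:
            # Push children reversed so the leftmost child is visited first.
            stack.extend(reversed(node))
    return leaves


def bfs_depths(root):
    """Breadth-first traversal; returns (label, depth) for every leaf."""
    queue, out = deque([(root, 0)]), []
    while queue:
        node, depth = queue.popleft()
        if is_leaf(node):
            out.append((node, depth))
        else:
            queue.extend((child, depth + 1) for child in node)
    return out


print(dfs_leaves(tree))
print(bfs_depths(tree))
```

Both functions use an explicit stack or queue rather than recursion, so the depth of the tree is not limited by the call stack. Whether leaves only, internal nodes, or both need visiting (the questions raised above) would change what each loop collects.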
-
TLDR: I am happy to collaborate next week.
Robert (@rgentlem) and I have e-talked about this a bit. As he won't be present physically next week and both Aidan (@ahl27) and I are planning to, and I had originally created the function […]
And yes, using Boost from base R's stats is currently out of the question. Even using the Matrix package's sparse matrices for this may be cumbersome: stats would then depend on Matrix, which currently strongly depends on stats.
-
Hi Aidan,
For pretty much all the use cases you have described, there are already functions available. E.g., ape alone offers three obvious functions to subset a tree ([…]). And there exist at least three different widely used implementations of circular plots for […]. Also […]. However, phylogenetic trees might […]
I had mixed experiences even getting some of these into a dendrogram object. And some functions in […]. Probably it would be a good idea to improve […]
-
Hi Klaus,
Thanks for your comment! I'm a big fan of […]. I'm very familiar with the […]
To your points on this project: I think there are a couple of counterpoints for consideration. First is regarding the state of the […] My personal view is that having that package set be externally maintained is good overall for R and phylogenetics, but I'm happy to discuss that more (either virtually or in person, if you'll be at the sprint). The larger point here is that I'm not convinced that updates to […]
The second point is that I also think it would help to clarify the goal of this project. This project started specifically because the help page for […] When we look at other built-in objects like lists or vectors, they have ways to construct them, ways to subset them, and ways to apply functions to them. I'm not completely decided on what those "base capabilities" look like. I threw out circular plotting because I think it's an interesting feature that would be a fairly easy addition to the current plotting system, but in my mind the only two main things that are really missing are a better […]
However, all of that is definitely debatable. One of the reasons I really wanted to come to this sprint was to discuss whether those are reasonable expectations, or whether we should be doing something different.
Thanks again for taking the time to give input on this. Looking forward to hearing from you, and please come say hi if you'll be attending the sprint!
-Aidan
-
See latest update at #29 (comment)
-
R Project Sprint 2023 - Dendrapply Refactor
https://contributor.r-project.org/r-project-sprint-2023/projects/dendrapply-refactor/