Infinage
Contributor

@Infinage Infinage commented Jun 8, 2023

Closes: #300

Description

What is the purpose of this pull request?

This pull request:

  • Implements the RFC proposal for adding the penguin dataset as an alternative to the iris dataset.

Related Issues

Does this pull request have any related issues?

This pull request:

Questions

Any questions for reviewers of this pull request?

I am unsure about the licensing information for this dataset. I simply copied these details from the pace-boston-house-prices dataset with minor modifications, so the LICENSE file and the LICENSE section of README.md may need extra eyes. Also, do I need to get in touch with the dataset authors for permission?

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

Data was cloned from this repo URL and the column names were altered to follow camelCase conventions. No other changes were made to the data.
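
For illustration, a renaming step of this kind could be sketched as follows. This is only a minimal sketch, not the script used for this pull request: the helper names `toCamelCase` and `renameKeys` and the raw column names in the sample row are hypothetical. The field names used later in this thread (`flipperLength`, `bodyMass`) suggest that any unit suffixes were also dropped, which this sketch does not attempt.

```javascript
'use strict';

// Convert a snake_case identifier to camelCase (e.g., 'body_mass' => 'bodyMass').
function toCamelCase( name ) {
	return name.replace( /_+([a-z0-9])/g, function onMatch( match, ch ) {
		return ch.toUpperCase();
	});
}

// Return a new row object whose keys are camelCased; values are left untouched.
function renameKeys( row ) {
	var keys = Object.keys( row );
	var out = {};
	var i;
	for ( i = 0; i < keys.length; i++ ) {
		out[ toCamelCase( keys[ i ] ) ] = row[ keys[ i ] ];
	}
	return out;
}

// Hypothetical raw row; these column names are assumptions for illustration only:
var raw = { 'species': 'Adelie', 'flipper_length': 181, 'body_mass': 3750 };
var renamed = renameKeys( raw );
```

Because only the keys are rewritten, the values stay identical to the source data, in line with the statement that no other changes were made.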

Checklist

Please ensure the following tasks are completed before submitting this pull request.


@stdlib-js/reviewers

Member

@Planeshifter Planeshifter left a comment

This looks great, thank you very much for your PR! I left a few minor requests and change suggestions. I didn't see any issues with respect to how the licenses are presented.

If applied, this commit will bring in the changes to add the
penguin dataset as an alternative to the iris dataset

Closes: #300
@Infinage
Contributor Author

Infinage commented Jun 9, 2023

@Planeshifter - thanks for the review! I have made all the suggested changes.

As an aside, I tried to create a scatter plot of Flipper Length vs Body Mass for the different species in examples/index.html using the code below, but I was not able to get the labels to show up. Could you please let me know whether there are built-in parameters I could use for this? Also, every point is rendered in the same color regardless of its label.

Code:

// Extract Penguins data...
x = [];
y = [];
species = [];
for ( i = 0; i < data.length; i++ ) {
	flipperLen = data[ i ].flipperLength;
	bodyMass = data[ i ].bodyMass;
	speciesType = data[ i ].species;
	if ( flipperLen !== null && bodyMass !== null && speciesType !== null ) {
		x.push( flipperLen );
		y.push( bodyMass );
		species.push( speciesMapping[ speciesType ] );
	}
}

// Create a plot instance:
opts = {
	'lineStyle': 'none',
	'symbols': 'closed-circle',
	'xLabel': 'Flipper Length (mm)',
	'yLabel': 'Body Mass (g)',
	'title': 'Flipper Length & Body Mass for Adelie, Chinstrap & Gentoo Penguins'
};
plot = new Plot( [ x ], [ y ], opts );

plot.width = 650;
plot.height = 480;
plot.colors = 'category20';
plot.labels = species;

// Render the plot:
console.log( plot.render( 'html' ) );

Output of the above code:

image

What I am hoping to achieve (minus the best fit line):

Palmer Penguins Scatterplot
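
One possible explanation for the single color is that the snippet above passes all points as one series (`[ x ]`, `[ y ]`), whereas colors and labels may be assigned per series rather than per point. That is an assumption about the plot API which is not confirmed in this thread. Under that assumption, a minimal sketch of splitting the data into one series per species could look like this; the helper name `groupBySpecies` and the sample rows are hypothetical.

```javascript
'use strict';

// Group rows by species, producing one x/y series per species.
function groupBySpecies( data ) {
	var labels = [];
	var index = {};
	var x = [];
	var y = [];
	var row;
	var k;
	var i;
	for ( i = 0; i < data.length; i++ ) {
		row = data[ i ];

		// Skip rows with missing values, as in the snippet above:
		if ( row.flipperLength === null || row.bodyMass === null || row.species === null ) {
			continue;
		}
		// Assign each species its own series index on first encounter:
		if ( !Object.prototype.hasOwnProperty.call( index, row.species ) ) {
			index[ row.species ] = labels.length;
			labels.push( row.species );
			x.push( [] );
			y.push( [] );
		}
		k = index[ row.species ];
		x[ k ].push( row.flipperLength );
		y[ k ].push( row.bodyMass );
	}
	return { 'x': x, 'y': y, 'labels': labels };
}

// Hypothetical sample rows standing in for the dataset:
var sample = [
	{ 'species': 'Adelie', 'flipperLength': 181, 'bodyMass': 3750 },
	{ 'species': 'Gentoo', 'flipperLength': 211, 'bodyMass': 4500 },
	{ 'species': 'Adelie', 'flipperLength': null, 'bodyMass': null },
	{ 'species': 'Chinstrap', 'flipperLength': 192, 'bodyMass': 3500 },
	{ 'species': 'Gentoo', 'flipperLength': 230, 'bodyMass': 5700 }
];
var series = groupBySpecies( sample );

// The grouped arrays could then replace `[ x ]` and `[ y ]` in the plot call,
// with `series.labels` used as the labels (untested here; assumes the plot
// API treats each inner array as a separate series).
```

With the data grouped this way, each species would carry its own label, and a per-series color scheme would have three series to distinguish instead of one.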

@Infinage Infinage requested a review from Planeshifter June 9, 2023 08:45
@kgryte
Member

kgryte commented Jun 11, 2023

@Infinage No need to add an examples/index.html file. If we were to generate a plot, we'd do so by opening an Electron window.

@Infinage
Contributor Author

Got it. So should I remove this file?

@kgryte kgryte added the Feature, Needs Review, and Needs Changes labels Feb 23, 2024
@kgryte kgryte changed the title feat: add datasets/penguins feat: add datasets/penguins Feb 24, 2024
@Planeshifter
Member

Did another review; I ended up updating the copyright year and making a few small adjustments. It would be great to finally get this nice dataset in! 🚀

@Planeshifter Planeshifter removed the Needs Changes label Sep 7, 2024
@kgryte
Member

kgryte commented Sep 7, 2024

@Planeshifter It looks to me like the README references are not formatted correctly. These should also be added to the bibliography database.

@Infinage Infinage closed this by deleting the head repository Oct 6, 2024
