- Tweak the license file so GitHub recognizes it.
- Fix a mistake in the manifest file so the change log is included as intended.
- Update the default Amazon Linux 2 AMI.
- Update and trim the main README a bit.
- Adopt `pyproject.toml`. It is "strongly recommended", and commands like `python setup.py sdist bdist_wheel` are deprecated in favor of `python -m build`.
- Trim outdated comments and the `cryptography` pin from `setup.py`.
- Update the testing code for setting up a private VPC.
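As a rough illustration of the packaging change described in the commit message, the sketch below shows a minimal `[build-system]` table and the `python -m build` invocation that takes the place of `python setup.py sdist bdist_wheel`. The setuptools backend, the version bound, and the names `MINIMAL_PYPROJECT` and `build_command` are assumptions made for this example; they are not taken from Flintrock's actual `pyproject.toml`.

```python
import shlex
import sys

# Assumption: a setuptools-based project. The real pyproject.toml in this
# commit may declare different requirements or a different backend.
MINIMAL_PYPROJECT = """\
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"
"""


def build_command(project_dir="."):
    """Return the argv that builds an sdist and a wheel via the `build` package.

    This is the replacement for `python setup.py sdist bdist_wheel`.
    The `build` package must be installed separately (pip install build).
    """
    return [sys.executable, "-m", "build", "--sdist", "--wheel", project_dir]


if __name__ == "__main__":
    # Print the example config and the command instead of running anything.
    print(MINIMAL_PYPROJECT)
    print(shlex.join(build_command()))
```

Running the printed command from the project root should leave the source distribution and the wheel in `dist/`, which is where the old `setup.py` commands put them as well.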
*[#348], [#367]: Bumped default Spark to 3.5.0 and default Hadoop to 3.3.6; dropped support for Python 3.6 and 3.7; added CI builds for Python 3.10, 3.11, and 3.12.
*[#361]: Migrated from AdoptOpenJDK, which is deprecated, to Adoptium OpenJDK.
*[#362], [#366]: Improved Flintrock's ability to clean up after launch failures.
*[#366]: Deprecated `--ec2-spot-request-duration`, which is not needed for one-time spot instances launched using the RunInstances API.
+*[#369]: Adopted `pyproject.toml` and tweaked Flintrock's Python packaging accordingly. This keeps Flintrock in line with modern Python packaging standards and should be transparent to end-users.
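To make the #366 entry more concrete, here is a hedged sketch of how a one-time spot request can be expressed through the `InstanceMarketOptions` parameter of boto3's `run_instances()`. As far as the EC2 documentation describes it, a one-time request stays active until the instances launch or the request is cancelled, which is why no request duration is needed. The helper name `one_time_spot_options` is hypothetical, and this is not Flintrock's actual launch code.

```python
def one_time_spot_options(max_price=None):
    """Build the InstanceMarketOptions argument for a one-time spot launch.

    Hypothetical helper for illustration only. No expiry ("ValidUntil") is
    set because one-time requests do not take one.
    """
    spot_options = {"SpotInstanceType": "one-time"}
    if max_price is not None:
        # EC2 expects the maximum hourly price as a string, e.g. "0.05".
        spot_options["MaxPrice"] = str(max_price)
    return {"MarketType": "spot", "SpotOptions": spot_options}


# Usage sketch (needs boto3 and AWS credentials; the AMI ID and instance
# type are placeholders, not values from this commit):
#
#   import boto3
#   ec2 = boto3.client("ec2")
#   ec2.run_instances(
#       ImageId="<your-ami-id>",
#       InstanceType="m5.large",
#       MinCount=1,
#       MaxCount=1,
#       InstanceMarketOptions=one_time_spot_options(0.05),
#   )
```

Leaving `max_price` unset omits `MaxPrice` from the request entirely, which lets EC2 apply its default price cap.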
If you want to [contribute](https://github.com/nchammas/flintrock/blob/master/CONTRIBUTING.md), follow the instructions in our contributing guide on [how to install Flintrock](https://github.com/nchammas/flintrock/blob/master/CONTRIBUTING.md#contributing-code).
@@ -203,17 +197,17 @@ There are some things that Flintrock specifically *does not* support.
Flintrock is not for managing long-lived clusters, or any infrastructure that serves as a permanent part of some environment.
-For starters, Flintrock provides no guarantee that clusters launched with one version of Flintrock can be managed by another version of Flintrock, and no considerations are made for any long-term use cases.
+For starters, Flintrock provides no guarantee that clusters launched with one version of Flintrock can be managed by another version of Flintrock, and no considerations are made for any long-term use cases.
-If you are looking for ways to manage permanent infrastructure, look at tools like [Terraform](https://www.terraform.io/), [Ansible](http://www.ansible.com/), [SaltStack](http://saltstack.com/), or [Ubuntu Juju](http://www.ubuntu.com/cloud/tools/juju). You might also find a service like [Databricks](https://databricks.com/product/databricks) useful if you're looking for someone else to host and manage Spark for you. Amazon also offers [Spark on EMR](https://aws.amazon.com/elasticmapreduce/details/spark/).
+If you are looking for ways to manage permanent infrastructure, look at tools like [Terraform](https://www.terraform.io/), [Ansible](http://www.ansible.com/), or [Ubuntu Juju](http://www.ubuntu.com/cloud/tools/juju). You might also find a service like [Databricks](https://databricks.com/product/databricks) useful if you're looking for someone else to host and manage Spark for you. Amazon also offers [Spark on EMR](https://aws.amazon.com/elasticmapreduce/details/spark/).
### Launching non-Spark-related services
-Flintrock is meant for launching Spark clusters that include closely related services like HDFS, Mesos, and YARN.
+Flintrock is meant for launching Spark clusters that include closely related services like HDFS.
-Flintrock is not for launching external datasources (e.g. Cassandra), or other services that are not closely integrated with Spark (e.g. Tez).
+Flintrock is not for launching external datasources (e.g. Cassandra), or other services that are not closely integrated with Spark (e.g. Tez).
-If you are looking for an easy way to launch other services from the Hadoop ecosystem, look at the [Apache Bigtop](http://bigtop.apache.org/) project.
+If you are looking for an easy way to launch other services from the Hadoop ecosystem, look at the [Apache Bigtop](http://bigtop.apache.org/) project.
Flintrock is really fast. This is how quickly it can launch fully operational clusters on EC2 compared to [spark-ec2](https://github.com/amplab/spark-ec2).
-
-#### Setup
-
-* Provider: EC2
-* Instance type: `m3.large`
-* AMI:
-* Flintrock: [Default Amazon Linux AMI](https://aws.amazon.com/amazon-linux-ami/)
The spark-ec2 launch times are sourced from [SPARK-5189](https://issues.apache.org/jira/browse/SPARK-5189).
-
-Note that AWS performance is highly variable, so you will not get these results consistently. They show the best case scenario for each tool, and not the typical case. For Flintrock, the typical launch time will be a minute or two longer.
+Flintrock is really fast. It can launch a 100-node cluster in about three minutes (give or take a few seconds due to AWS's normal performance variability).
### Advanced Storage Setup
@@ -330,7 +302,7 @@ Flintrock is built and tested against vanilla Amazon Linux and CentOS. You can e
Supporting multiple versions of anything is tough. There's more surface area to cover for testing, and over the long term the maintenance burden of supporting something non-current with bug fixes and workarounds really adds up.
-There are projects that support stuff across a wide cut of language or API versions. For example, Spark supports Java 7 and 8, and Python 2.6+ and 3+. The people behind these projects are gods. They take on an immense maintenance burden for the benefit and convenience of their users.
+There are projects that support stuff across a wide cut of language or API versions. For example, Spark supports multiple versions of Java, Scala, R, and Python. The people behind these projects are gods. They take on an immense maintenance burden for the benefit and convenience of their users.
We here at project Flintrock are much more modest in our abilities. We are best able to serve the project over the long term when we limit ourselves to supporting a small but widely applicable set of configurations.