diff --git a/content/articles/2015/10/distributed-ad-hoc.md b/content/articles/2015/10/distributed-ad-hoc.md
index 08da234..633dd18 100644
--- a/content/articles/2015/10/distributed-ad-hoc.md
+++ b/content/articles/2015/10/distributed-ad-hoc.md
@@ -15,8 +15,8 @@ Distributed
 The `distributed` project prototype provides distributed computing on a
 cluster in pure Python.
 
-* [docs](http://distributed.readthedocs.org/en/latest/),
-  [source](http://github.com/mrocklin/distributed/),
+* [docs](https://distributed.dask.org/en/latest/),
+  [source](https://github.com/mrocklin/distributed/),
   [chat](https://gitter.im/mrocklin/distributed)
 
 concurrent.futures interface
@@ -109,8 +109,8 @@
 As an example we perform a binary tree reduction on a sequence of random
 arrays. This is the kind of algorithm you would find hard-coded into a library
 like
-[Spark](http://spark.apache.org/) or
-[dask.array](http://dask.pydata.org/en/latest/array.html)/[dask.dataframe](http://dask.pydata.org/en/latest/dataframe.html)
+[Spark](https://spark.apache.org/) or
+[dask.array](https://docs.dask.org/en/latest/array.html)/[dask.dataframe](https://docs.dask.org/en/latest/dataframe.html)
 but that we can accomplish by hand with some for loops while still using
 parallel distributed computing. The difference here is that we're not limited
 to the algorithms chosen for us and can screw around more freely.
@@ -156,15 +156,15 @@ Notes
 -----
 
 Various other Python frameworks provide distributed function evaluation. A few
-are listed [here](http://distributed.readthedocs.org/en/latest/related-work.html)
+are listed [here](https://distributed.dask.org/en/latest/related-work.html)
 . Notably we're stepping on the toes of
-[SCOOP](http://scoop.readthedocs.org/en/0.7/), an excellent library that also
+[SCOOP](https://scoop.readthedocs.org/en/0.7/), an excellent library that also
 provides a distributed `concurrent.futures` interface.
 
 The `distributed` project could use a more distinct name. Any suggestions?
 
 For more information see the following links:
 
-* [Documentation](http://distributed.readthedocs.org/en/latest/)
-* [Source on Github](http://github.com/mrocklin/distributed/)
+* [Documentation](https://distributed.dask.org/en/latest/)
+* [Source on Github](https://github.com/mrocklin/distributed/)
 * [Gitter chat](https://gitter.im/mrocklin/distributed)
diff --git a/content/articles/2015/10/distributed-hdfs.md b/content/articles/2015/10/distributed-hdfs.md
index 1db09e7..f618c96 100644
--- a/content/articles/2015/10/distributed-hdfs.md
+++ b/content/articles/2015/10/distributed-hdfs.md
@@ -63,7 +63,7 @@
 We put a dataset on an HDFS instance through the command line interface:
 
 Then we query the namenode to discover how it sharded this file. To avoid
 JVM dependence we use Spotify's
-[snakebite](http://snakebite.readthedocs.org/en/latest/) library which
+[snakebite](https://snakebite.readthedocs.org/en/latest/) library which
 includes the protobuf headers necessary to interact with the namenode
 directly, without using the Java HDFS client library.
@@ -159,8 +159,8 @@
 think about remote hosts that have files on their local file systems. HDFS
 has played its part and can exit the stage.
 
 *Note: since writing this we've found a
-[wonderful article](http://jvns.ca/blog/2014/05/15/diving-into-hdfs/) by
-[Julia Evans](http://jvns.ca/), that describes a similar process.*
+[wonderful article](https://jvns.ca/blog/2014/05/15/diving-into-hdfs/) by
+[Julia Evans](https://jvns.ca/) that describes a similar process.*
 
 Data-local tasks with distributed
@@ -198,7 +198,7 @@ Or alternatively we've wrapped up both steps into a little convenience function:
 ```
 
 As a reminder from
-[last time](http://blaze.pydata.org/blog/2015/10/27/distributed-ad-hoc/) these
+[last time](https://blaze.pydata.org/blog/2015/10/27/distributed-ad-hoc/) these
 operations produce `Future` objects that point to remote results on the worker
 computers. This does not pull results back to local memory. We can use these
 futures in future computations with the executor.
diff --git a/content/articles/2016/02/dask-distributed-1.md b/content/articles/2016/02/dask-distributed-1.md
index ac6f93d..47dc369 100644
--- a/content/articles/2016/02/dask-distributed-1.md
+++ b/content/articles/2016/02/dask-distributed-1.md
@@ -36,7 +36,7 @@ cluster.
 
 We provision nine `m3.2xlarge` nodes on EC2. These have eight cores and 30GB
 of RAM each. On this cluster we provision one scheduler and nine workers (see
-[setup docs](http://distributed.readthedocs.org/en/latest/setup.html)). (More
+[setup docs](https://distributed.dask.org/en/latest/setup.html)). (More
 on launching in later posts.) We have five months of data, from 2015-01-01 to
 2015-05-31 on the `githubarchive-data` bucket in S3. This data is publicly
 available if you want to play with it on EC2. You can download the full
@@ -105,7 +105,7 @@ records = e.persist(records)
 
 The data lives in S3 in hourly files as gzip-encoded, line-delimited JSON.
 The `s3.read_text` and `text.map` functions produce
-[dask.bag](http://dask.pydata.org/en/latest/bag.html) objects which track our
+[dask.bag](https://docs.dask.org/en/latest/bag.html) objects which track our
 operations in a lazily built task graph. When we ask the executor to `persist`
 this collection we ship those tasks off to the scheduler to run on all of the
 workers in parallel. The `persist` function gives us back another `dask.bag`
@@ -192,7 +192,7 @@ overhead.
 
 Investigate Jupyter
 -------------------
-We investigate the activities of [Project Jupyter](http://jupyter.org/). We
+We investigate the activities of [Project Jupyter](https://jupyter.org/). We
 chose this project because it's sizable and because we understand the players
 involved and so can check our accuracy. This will require us to filter our data
 to a much smaller subset, then find popular repositories and members.
@@ -489,10 +489,10 @@ done differently with more time.
 
 Links
 -----
-* [dask](https://dask.pydata.org/en/latest/), the original project
-* [distributed](https://distributed.readthedocs.org/en/latest/), the
+* [dask](https://docs.dask.org/en/latest/), the original project
+* [distributed](https://distributed.dask.org/en/latest/), the
   distributed memory scheduler powering the cluster computing
-* [dask.bag](http://dask.pydata.org/en/latest/bag.html), the user API we've
+* [dask.bag](https://docs.dask.org/en/latest/bag.html), the user API we've
   used in this post.
 * This post largely repeats work by [Blake Griffith](https://github.com/cowlicks)
   in a [similar post](https://www.continuum.io/content/dask-distributed-and-anaconda-cluster)
diff --git a/content/pages/overview.md b/content/pages/overview.md
index 9cb2a05..12d118b 100644
--- a/content/pages/overview.md
+++ b/content/pages/overview.md
@@ -30,17 +30,17 @@ The following characteristics can define a particular Data Processing System:
 The goal of the Blaze ecosystem is to simplify data processing for users by
 providing:
 - A common language to describe data that is independent of the Data Processing System, called
-[**datashape**](http://blaze.github.io/pages/projects/datashape).
+[**datashape**](https://blaze.pydata.org/pages/projects/datashape).
 - A common interface to query data that is independent of the Data Processing System, called
-[**blaze**](http://blaze.github.io/pages/projects/blaze).
+[**blaze**](https://blaze.pydata.org/pages/projects/blaze).
 - A common utility library to move data from one format or system to another, called
-[**odo**](http://blaze.github.io/pages/projects/odo).
-- Compressed column stores, called [**bcolz**](http://blaze.github.io/pages/projects/bcolz) and
-[**castra**](http://blaze.github.io/pages/projects/castra).
-- A parallel computational engine, called [**dask**](http://blaze.github.io/pages/projects/dask).
+[**odo**](https://blaze.pydata.org/pages/projects/odo).
+- Compressed column stores, called [**bcolz**](https://blaze.pydata.org/pages/projects/bcolz) and
+[**castra**](https://blaze.pydata.org/pages/projects/castra).
+- A parallel computational engine, called [**dask**](https://blaze.pydata.org/pages/projects/dask).
 
 ## Learn more
 
 The project repositories can be found under the
 [Github Blaze Organization](https://github.com/blaze). Feel free to
-reach out to the Blaze Developers through our mailing list, blaze-dev@continuum.io.
\ No newline at end of file
+reach out to the Blaze Developers through our mailing list, blaze-dev@continuum.io.
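
A few illustrative sketches for the articles touched above, before the remaining page and template hunks. First, the ad-hoc article describes a by-hand binary tree reduction over futures but the code itself is outside this patch. Here is a minimal local sketch using the standard library's `ThreadPoolExecutor` as a stand-in for `distributed`'s executor, with random floats in place of the article's random arrays; all names below are illustrative, not code from the post.

```python
# Binary tree reduction over futures, by hand, using the concurrent.futures
# interface that distributed's executor also implements.
from concurrent.futures import ThreadPoolExecutor
from operator import add
import random

e = ThreadPoolExecutor(max_workers=4)

# Leaves of the tree: sixteen futures, each producing one random float.
futures = [e.submit(random.random) for _ in range(16)]

# Pair up neighbors and submit their sums, halving the list each pass,
# until a single future holds the total.
while len(futures) > 1:
    futures = [e.submit(add, a.result(), b.result())
               for a, b in zip(futures[::2], futures[1::2])]

print(futures[0].result())
```

Against `distributed` itself you could pass the futures straight to `submit` and let the scheduler keep the intermediate results on the workers rather than calling `.result()` locally at each level.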
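Second, the GitHub-archive article above builds a lazy `dask.bag` and then `persist`s it on the cluster. A minimal local sketch of that same build-lazily-then-persist pattern, written against the current `dask.bag` API rather than the post's early `s3.read_text`/`e.persist` helpers; `events.json`, a line-delimited JSON file, is a hypothetical stand-in for the S3 data.

```python
# Build a lazy task graph, then persist it: a local sketch of the pattern
# from the dask-distributed-1 article ('events.json' is hypothetical).
import json
import dask.bag as db

lines = db.read_text('events.json')  # lazy: builds a task graph, reads nothing
records = lines.map(json.loads)      # still lazy: extends the same graph

records = records.persist()  # execute now; with a distributed scheduler the
                             # results stay in worker memory rather than being
                             # pulled back to the client

print(records.take(2))       # bring just two parsed records back locally
```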
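Finally, to make the overview's division of labor concrete: blaze expresses a query once for whatever backend holds the data, while odo moves the data itself between systems. A hedged sketch; `accounts.csv`, its `amount` column, and the SQLite target are hypothetical stand-ins, not examples from this site.

```python
# blaze: backend-independent query expressions; odo: data migration.
from blaze import data, compute
from odo import odo

t = data('accounts.csv')               # wrap a data source (hypothetical CSV)
expr = t[t.amount > 100].amount.sum()  # a lazy expression; nothing runs yet
print(compute(expr))                   # dispatch to the backend holding the data

# Move the same CSV into a SQLite table; 'uri::table' is odo's convention.
odo('accounts.csv', 'sqlite:///accounts.db::accounts')
```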
diff --git a/content/pages/projects/blaze.md b/content/pages/projects/blaze.md
index 1a69db2..97edaef 100644
--- a/content/pages/projects/blaze.md
+++ b/content/pages/projects/blaze.md
@@ -2,4 +2,4 @@ Title: Blaze
 Project: core
 Category: Projects
 Subtitle: An interface to query data on different storage systems
-Docs: http://blaze.readthedocs.org/en/latest/index.html
+Docs: https://blaze.readthedocs.org/en/latest/index.html
diff --git a/content/pages/projects/dask.md b/content/pages/projects/dask.md
index 33816f5..4f6053f 100644
--- a/content/pages/projects/dask.md
+++ b/content/pages/projects/dask.md
@@ -2,4 +2,4 @@ Title: Dask
 Project: core
 Category: Projects
 Subtitle: Parallel computing through task scheduling and blocked algorithms
-Docs: http://dask.readthedocs.org/en/latest/
+Docs: https://docs.dask.org/en/latest/
diff --git a/content/pages/projects/datashape.md b/content/pages/projects/datashape.md
index e325a03..21c1e9d 100644
--- a/content/pages/projects/datashape.md
+++ b/content/pages/projects/datashape.md
@@ -2,4 +2,4 @@ Title: Datashape
 Project: core
 Category: Projects
 Subtitle: A data description language
-Docs: http://datashape.readthedocs.org/en/latest/
+Docs: https://datashape.readthedocs.io/en/latest/
diff --git a/content/pages/projects/odo.md b/content/pages/projects/odo.md
index c919dd6..3d19c63 100644
--- a/content/pages/projects/odo.md
+++ b/content/pages/projects/odo.md
@@ -2,4 +2,4 @@ Title: Odo
 Project: core
 Category: Projects
 Subtitle: Data migration between different storage systems
-Docs: http://odo.readthedocs.org/en/latest/
\ No newline at end of file
+Docs: https://odo.readthedocs.io/en/latest/
diff --git a/content/pages/talks/ep2015-blaze.md b/content/pages/talks/ep2015-blaze.md
index c5b349f..a04ef1e 100644
--- a/content/pages/talks/ep2015-blaze.md
+++ b/content/pages/talks/ep2015-blaze.md
@@ -5,5 +5,5 @@ Category: Talks
 Tags: blaze,dask,odo,datashape
 Video: https://www.youtube.com/embed/QKBcnEhkCtk
 Site: https://ep2015.europython.eu/conference/talks/scale-your-data-not-your-process-welcome-to-the-blaze-ecosystem
-Slides: http://chdoig.github.io/ep2015-blaze/
+Slides: https://chdoig.github.io/ep2015-blaze/
 
diff --git a/pelicanconf.py b/pelicanconf.py
index 0960c3d..8b3415e 100644
--- a/pelicanconf.py
+++ b/pelicanconf.py
@@ -4,7 +4,7 @@
 AUTHOR = u'Blaze Developers'
 SITENAME = u'The Blaze Ecosystem'
 
-SITEURL = 'http://blaze.github.io/'
+SITEURL = 'https://blaze.github.io/'
 
 TITLE = 'The Blaze Ecosystem'
 SUBTITLE = 'Connecting people to data'
diff --git a/theme/templates/base.html b/theme/templates/base.html
index 12400a7..f6c5016 100644
--- a/theme/templates/base.html
+++ b/theme/templates/base.html
@@ -15,8 +15,8 @@
-
-
+
+
@@ -29,7 +29,7 @@
-
+
@@ -72,4 +72,4 @@
 –––––––––––––––––––––––––––––––––––––––––––––––––– -->
-
\ No newline at end of file
+
diff --git a/theme/templates/includes/comments.html b/theme/templates/includes/comments.html
index dee2434..735b1b0 100755
--- a/theme/templates/includes/comments.html
+++ b/theme/templates/includes/comments.html
@@ -15,9 +15,9 @@
 (document.getElementsByTagName('head')[0] || document.getElementsByTagName('body')[0]).appendChild(dsq);
 })();
-
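
One note on the `pelicanconf.py` hunk above: Pelican copies SITEURL verbatim into the generated site's absolute links and feeds, so the scheme matters once the pages are served over https. A minimal sketch of the settings involved, limited to the values actually visible in the hunk:

```python
# Minimal pelicanconf.py sketch; values are the ones shown in the hunk above.
# An http:// SITEURL on an https-served site yields mixed-content links,
# which is why the scheme flip in this patch is worth making.
AUTHOR = u'Blaze Developers'
SITENAME = u'The Blaze Ecosystem'
SITEURL = 'https://blaze.github.io/'

TITLE = 'The Blaze Ecosystem'
SUBTITLE = 'Connecting people to data'
```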