This repository covers data management and big data technologies, including databases, querying, and big data processing. Topics include Hadoop (MapReduce, HDFS), Apache Spark, data security, and optimization techniques. Students will learn Spark’s architecture, data distribution, parallel computing, and memory caching to improve big data solutions.

Hazim-HF/Data-Management


Data-Management

  1. Pull docker image
    docker pull hortonworks/sandbox-hdp:2.6.5

  2. Run HDP Sandbox container
    docker run -d --name hdp-sandbox \
      --hostname sandbox-hdp.hortonworks.com \
      --privileged \
      -p 8080:8080 -p 2222:22 \
      -p 80:80 -p 2181:2181 -p 8042:8042 -p 7077:7077 -p 8888:8888 \
      -p 50070:50070 -p 9000:9000 -p 10000:10000 \
      -p 9995:9995 -p 8020:8020 \
      hortonworks/sandbox-hdp:2.6.5

  3. Follow the log until Ambari has started
    docker logs -f hdp-sandbox
    Expected output: Ambari Server 'STARTED'

  4. SSH into the container (host port 2222 maps to container port 22)
    ssh root@localhost -p 2222

  5. If a host port is already in use, map a different host port
    e.g., replace -p 50070:50070 with -p 15070:50070, then open http://localhost:15070

  6. Remove the container (stop it first if it is running)
    docker stop hdp-sandbox
    docker rm hdp-sandbox

  7. Rename a container
    docker rename hdp-sandbox sandbox-hdp2

  8. Start the sandbox again later
    docker start hdp-sandbox
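
The lifecycle steps above (create, wait for Ambari, start again later) can be sketched as a few Bash helpers. This is only an illustration under stated assumptions: the function names are hypothetical and not part of this repository, the readiness text is taken from step 3, and the 900-second timeout and 10-second polling interval are arbitrary choices.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; none of these names come from the repository.

# Print the container state ("running", "exited", ...) or nothing if no
# container with that name exists.
sandbox_state() {
  docker inspect --format '{{.State.Status}}' "${1:-hdp-sandbox}" 2>/dev/null || true
}

# Map a state string to the step that applies.
next_step() {
  case "$1" in
    "")             echo "run" ;;    # no container yet: step 2
    running)        echo "none" ;;   # already up
    exited|created) echo "start" ;;  # stopped: step 8
    *)              echo "check" ;;  # paused, restarting, ...: inspect manually
  esac
}

# Succeed if the log text on stdin contains the readiness line from step 3.
ambari_started() {
  grep -q "Ambari Server 'STARTED'"
}

# Poll the container log until Ambari reports STARTED or the timeout expires.
# Usage: wait_for_ambari [CONTAINER] [TIMEOUT_SECONDS]
wait_for_ambari() {
  local container=${1:-hdp-sandbox} timeout=${2:-900} waited=0
  until docker logs "$container" 2>&1 | ambari_started; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for Ambari in $container" >&2
      return 1
    fi
    sleep 10
    waited=$((waited + 10))
  done
  echo "Ambari is up in $container"
}
```

For example, after sourcing the file, `[ "$(next_step "$(sandbox_state)")" = start ] && docker start hdp-sandbox && wait_for_ambari` would restart a stopped sandbox and block until the log shows the readiness line.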

Service ports:

8080 - Ambari UI
4040 - Spark UI
10000 - Hive (HiveServer2)
8888 - Jupyter Notebook
9995 - Zeppelin
16010 - HBase
50070 - HDFS
8088 - YARN (Yet Another Resource Negotiator)
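
The port remapping from step 5 can be sketched as a small Bash check that picks the `-p` flag for one service. This is a rough sketch, not a tested part of the setup: the function names are hypothetical, the check relies on Bash's `/dev/tcp` feature (so it needs `bash`, not plain `sh`), and it only detects listeners reachable on 127.0.0.1.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative only.

# Return 0 if something already accepts TCP connections on the given
# local port. The connection attempt runs in a subshell so the file
# descriptor is closed again immediately.
port_in_use() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Print a -p flag for docker run. The container port never changes; the
# host port falls back to the alternative if the preferred one is busy.
# Usage: publish_flag CONTAINER_PORT ALT_HOST_PORT
publish_flag() {
  local container_port=$1 alt_host_port=$2
  if port_in_use "$container_port"; then
    echo "-p ${alt_host_port}:${container_port}"
  else
    echo "-p ${container_port}:${container_port}"
  fi
}
```

Calling `publish_flag 50070 15070` prints the flag to pass to `docker run` for the HDFS UI; if the alternative host port is chosen, the UI is reached through that port instead (http://localhost:15070 in this example).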

winutils.exe (Hadoop 3.4.0, Windows 10 x64), for running Hadoop locally on Windows:
https://github.com/kontext-tech/winutils/blob/master/hadoop-3.4.0-win10-x64/bin/winutils.exe
