ReadMe
Initial Configuration:
❖ Programming Language : Python
❖ Execution Environment : Google Cloud
❖ Programming Framework : Apache Spark
Files Attached:
- WordCountAssignment : Contains the code for all three parts of the assignment.
- Part1Output.txt : Sample output file for part 1 of the assignment
- Part2Output.txt : Sample output file for part 2 of the assignment
- Part3Output.txt : Output file for part 3 of the assignment
- Results and Snapshots : Contains results and associated snapshots. The results were received as expected.
Technical Specifications: I wrote the code with PySpark, the Python library for Apache Spark. To solve the assignment problems, I used functions such as map(), collect(), join(), and reduceByKey(). A detailed explanation is documented in the WordCountAssignment and Results and Snapshots files.
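For readers unfamiliar with these functions, the following is a minimal sketch of how map(), reduceByKey(), join(), and collect() typically combine in a PySpark word count. It is not the code from WordCountAssignment: the function names (tokenize, word_count, shared_words, main), the sample sentences, and the local[*] master are all assumptions made for illustration, and flatMap() is used in addition to the functions listed above.

```python
import re


def tokenize(line):
    """Lowercase a line and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", line.lower())


def word_count(sc, lines):
    """Return an RDD of (word, count) pairs for a list of text lines.

    `sc` is an existing pyspark SparkContext (assumed to be created by the caller).
    """
    return (
        sc.parallelize(lines)
        .flatMap(tokenize)                 # one record per word
        .map(lambda word: (word, 1))       # pair each word with a count of 1
        .reduceByKey(lambda a, b: a + b)   # sum the counts per word
    )


def shared_words(counts_a, counts_b):
    """Join two (word, count) RDDs; keeps only words present in both."""
    # join() yields (word, (count_in_a, count_in_b)) for matching keys.
    return counts_a.join(counts_b)


def main():
    # Imported here so tokenize() can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "WordCountSketch")  # hypothetical app name
    try:
        doc_a = word_count(sc, ["the quick brown fox", "the lazy dog"])
        doc_b = word_count(sc, ["the dog barks"])
        print(dict(doc_a.collect()))                  # counts for the first sample
        print(dict(shared_words(doc_a, doc_b).collect()))  # words in both samples
    finally:
        sc.stop()
```

Calling main() in an environment where PySpark is installed runs the sketch on the two sample inputs. collect() brings the results back to the driver, so it is only suitable for small outputs such as these.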