The wordcount_spark.py program we wrote earlier finds the word that is used the most times in the input text. It does this with a sum reduction using the add operator. Your job is to modify this program to use a different kind of reduction in order to count the number of distinct words in the input text.
Call your new program distinct_spark.py and commit it to the repository you used for Assignment 3.
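
To make the idea of "a different kind of reduction" concrete, here is a minimal sketch in plain Python rather than Spark. It contrasts a sum reduction using add with a set-union reduction. The sample sentence and the variable names are made up for illustration, and set union is only one possible reading of the hint, not necessarily the intended solution.

```python
from functools import reduce
from operator import add

# Hypothetical sample input, standing in for the real input text.
words = "the cat and the hat and the bat".split()

# Sum reduction: map every word to 1, then combine the values with add.
# This yields the total number of words, duplicates included.
total_words = reduce(add, (1 for _ in words))

# Set-union reduction: map every word to a one-element set, then combine
# the sets with union. Duplicates collapse, so only distinct words remain.
distinct_words = reduce(lambda a, b: a | b, ({w} for w in words))
distinct_count = len(distinct_words)

print(total_words, distinct_count)
```

Both reductions have the same shape: a map step that turns each word into a value, followed by an associative combining function. Only the value type and the combining operator change, which suggests the modification to wordcount_spark.py may be similarly small.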