hadoop-lzo.jar is preinstalled on EMR. To split indexed LZO files when reading them (yielding one Spark partition per block instead of one partition per file), we need to read them with sc.newAPIHadoopFile() rather than sc.textFile(), as is currently the case: sc.textFile() uses the default TextInputFormat, which treats each compressed LZO file as a single unsplittable input.
More info can be found here.
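A minimal sketch of the proposed change, assuming the hadoop-lzo classes are on the classpath and that each .lzo file has a matching .lzo.index file (the path and app name below are hypothetical):

```scala
import com.hadoop.mapreduce.LzoTextInputFormat
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("lzo-read"))

// LzoTextInputFormat consults the .lzo.index files and emits one input
// split per block, so Spark creates one partition per block rather than
// one partition per file, as sc.textFile() would.
val lines = sc
  .newAPIHadoopFile[LongWritable, Text, LzoTextInputFormat](
    "s3://some-bucket/some-path/*.lzo")
  .map { case (_, text) => text.toString }
```

Without the index files present, the same code still works but falls back to one split per file, so indexing the files is a prerequisite for the partitioning benefit.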