PySpark Error - EOFError
I send almost the whole afternoon trying to debug an EOFError while implementing KMeansStreaming which uses a streaming THOR file as input. ( code ). I found the reason why I was getting the error. There were two reasons: The first clue was “java.lang.IllegalArgumentException: requirement failed” – which means the training and the testing data are not of the same dimension. The second clue was actually the EOFError – which is actually an out of memory error. The workaround was to increase the memory allocated using the option –executor-memory.
Comments
Post a Comment