  1. python - PySpark: "Exception: Java gateway process exited before ...

    I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error: Exception: Java gateway process exited before sending the driver its port number when sc = …

  2. Rename more than one column using withColumnRenamed

    Since PySpark 3.4.0, you can use the withColumnsRenamed() method to rename multiple columns at once. It takes as input a map of existing column names and the corresponding …

  3. How to change dataframe column names in PySpark?

    I come from pandas background and am used to reading data from CSV files into a dataframe and then simply changing the column names to something useful using the simple command: …

  4. python - Spark Equivalent of IF Then ELSE - Stack Overflow

  5. Pyspark replace strings in Spark dataframe column

  6. PySpark: withColumn() with two conditions and three outcomes

    The withColumn function in PySpark lets you create a new column based on conditions: add the when and otherwise functions and you have a working if/then/else structure.

  7. Retrieve top n in each group of a DataFrame in pyspark

    I know the question is asked for pyspark, but I was looking for the similar answer in Scala, i.e. Retrieve top n values in each group of a DataFrame in Scala. Here is the Scala version of …

  8. Pyspark: display a spark data frame in a table format

  9. pyspark: ValueError: Some of types cannot be determined after …

  10. pyspark : NameError: name 'spark' is not defined

    Alternatively, you can use the pyspark shell, where spark (the Spark session) and sc (the Spark context) are predefined (see also NameError: name 'spark' is not defined, how to solve?).