  1. PySpark: multiple conditions in when clause - Stack Overflow

    Jun 8, 2016 · In PySpark, multiple conditions in when can be built using & (for and) and | (for or). Note: in PySpark it is important to enclose every expression within parentheses () that combine …
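
    A minimal sketch of what the snippet describes, with each comparison wrapped in parentheses before combining; the DataFrame and column names (age, country) are illustrative, not taken from the post.

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(25, "US"), (40, "DE")], ["age", "country"])

        # each comparison is parenthesized before combining with & (and) / | (or)
        df = df.withColumn(
            "segment",
            F.when((F.col("age") >= 30) & (F.col("country") == "US"), "a")
             .when((F.col("age") < 30) | (F.col("country") == "DE"), "b")
             .otherwise("c"),
        )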

  2. pyspark - How to use AND or OR condition in when in Spark

    pyspark.sql.functions.when takes a Boolean Column as its condition. When using PySpark, it's often useful to think "Column Expression" when you read "Column". Logical operations on …
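
    A small sketch of that idea: a comparison on a column yields a Boolean Column expression, which when() and filter() both accept. The column names and data are illustrative.

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(1, 10), (2, 25)], ["id", "score"])

        cond = F.col("score") > 20          # a Column of booleans, not a Python bool
        df = df.withColumn("high", F.when(cond, True).otherwise(False))
        df.filter(cond).show()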

  3. Comparison operator in PySpark (not equal/ !=) - Stack Overflow

    Aug 24, 2016 · Comparison operator in PySpark (not equal / !=)
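
    A hedged sketch of a not-equal filter; note that rows where the column is NULL are dropped by !=, so the second filter keeps them explicitly. The column name and data are illustrative.

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([("a",), ("b",), (None,)], ["letter"])

        df.filter(F.col("letter") != "a").show()                               # drops the NULL row
        df.filter((F.col("letter") != "a") | F.col("letter").isNull()).show()  # keeps it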

  4. Rename more than one column using withColumnRenamed

    Since PySpark 3.4.0, you can use the withColumnsRenamed() method to rename multiple columns at once. It takes as input a map of existing column names and the corresponding …
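
    A minimal sketch of withColumnsRenamed (PySpark 3.4.0 or later); the column names are illustrative.

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(1, "x")], ["id", "val"])

        # map of existing column name -> new column name
        df = df.withColumnsRenamed({"id": "user_id", "val": "value"})
        df.printSchema()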

  5. python - PySpark: "Exception: Java gateway process exited before ...

    I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error: Exception: Java gateway process exited before sending the driver its port number when sc = …
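
    A common first check, offered as an assumption rather than anything stated in the snippet, is that a JDK is installed and JAVA_HOME points at it before the SparkContext is created; the path below is a placeholder.

        import os
        os.environ["JAVA_HOME"] = "/path/to/jdk"   # placeholder path, not from the post

        from pyspark import SparkContext
        sc = SparkContext("local", "example")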

  6. python - Spark Equivalent of IF Then ELSE - Stack Overflow

    In PySpark, the usual equivalent of IF/THEN/ELSE logic on a column is when(...).otherwise(...) from pyspark.sql.functions.
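
    A short sketch of that pattern; the column name and threshold are illustrative.

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(5,), (15,)], ["n"])

        # IF n > 10 THEN 'big' ELSE 'small'
        df = df.withColumn("size", F.when(F.col("n") > 10, "big").otherwise("small"))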

  7. Pyspark replace strings in Spark dataframe column

    For Spark 1.5 or later, you can use the functions package:

        from pyspark.sql.functions import regexp_replace
        newDf = df.withColumn('address', regexp_replace('address', 'lane', 'ln'))

    Quick …
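
    A self-contained version of the same call, under the assumption that address is a string column; the sample row is invented for illustration.

        from pyspark.sql import SparkSession
        from pyspark.sql.functions import regexp_replace

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([("12 maple lane",)], ["address"])

        newDf = df.withColumn("address", regexp_replace("address", "lane", "ln"))
        newDf.show(truncate=False)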

  8. Pyspark dataframe LIKE operator - Stack Overflow

    Oct 24, 2016 · What is the equivalent in PySpark of the LIKE operator? For example, I would like to do: SELECT * FROM table WHERE column LIKE "*somestring*"; looking for something easy …
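
    A sketch of LIKE-style matching using the standard Column methods like() and contains(); the column name and data are illustrative.

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([("foo somestring bar",), ("other",)], ["column"])

        df.filter(F.col("column").like("%somestring%")).show()    # SQL LIKE pattern
        df.filter(F.col("column").contains("somestring")).show()  # plain substring match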

  9. pyspark: rolling average using timeseries data - Stack Overflow

    Aug 22, 2017 · pyspark: rolling average using timeseries data
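
    A hedged sketch of one common approach: cast the timestamp to epoch seconds and use a range-based window. The column names, sample data, and the 7-day window length are illustrative choices, not taken from the post.

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F
        from pyspark.sql.window import Window

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame(
            [("2017-08-01", 1.0), ("2017-08-05", 3.0), ("2017-08-20", 5.0)],
            ["date", "value"],
        ).withColumn("ts", F.col("date").cast("timestamp"))

        days = lambda n: n * 86400  # rangeBetween bounds are in seconds here
        w = Window.orderBy(F.col("ts").cast("long")).rangeBetween(-days(7), 0)
        df = df.withColumn("rolling_avg", F.avg("value").over(w))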

  10. python - Concatenate two PySpark dataframes - Stack Overflow

    May 20, 2016 · Use the unionByName method in PySpark, which concatenates two DataFrames along axis 0, as the pandas concat method does. Now suppose you have df1 with columns id, …
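
    A minimal sketch of row-wise concatenation with unionByName, which matches columns by name rather than by position; the columns (id, name) are illustrative.

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df1 = spark.createDataFrame([(1, "a")], ["id", "name"])
        df2 = spark.createDataFrame([("b", 2)], ["name", "id"])  # same columns, different order

        combined = df1.unionByName(df2)   # aligned by column name, not position
        combined.show()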