Transpose Rows into Columns and Sum the Total Marks with PySpark


Input:


Output:


Solve:

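from pyspark.sql.functions import expr
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

# Assumed schema: the column names are the ones used below; the types are a guess
# (Marks has to be numeric for sum() to work).
schema = StructType([
    StructField("Roll_No", IntegerType()),
    StructField("Subject", StringType()),
    StructField("Marks", IntegerType()),
])

df = spark.read.format("csv").schema(schema).load("file:///home/cloudera/marks.txt")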

df.show()

df.printSchema()

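# One row per Roll_No, one column per distinct Subject, each cell holding the summed Marks.
dfs = df.groupBy("Roll_No").pivot("Subject").sum("Marks")

dfs.show()

# Add the per-subject columns to get each student's total.
# If a student has no row for a subject, that cell is null and so is the total.
dfx = dfs.withColumn("total", expr("Computer + English + Maths + Science + Tamil"))

dfx.show()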
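
To see what the groupBy / pivot / sum chain computes without a Spark session, the same reshaping can be sketched in plain Python. This is only an illustration: the function name `pivot_and_total` and the sample rows are hypothetical and do not come from marks.txt. Like the `expr` call above, it leaves the total empty (`None`) when a student has no marks for some subject.

```python
from collections import defaultdict


def pivot_and_total(rows):
    """Mimic groupBy("Roll_No").pivot("Subject").sum("Marks") plus a total column.

    rows is an iterable of (roll_no, subject, marks) tuples.
    """
    # The distinct subjects become the output columns.
    subjects = sorted({subject for _, subject, _ in rows})

    # Sum the marks for every (roll_no, subject) pair.
    sums = defaultdict(lambda: defaultdict(int))
    for roll_no, subject, marks in rows:
        sums[roll_no][subject] += marks

    result = []
    for roll_no in sorted(sums):
        record = {"Roll_No": roll_no}
        for subject in subjects:
            # None where the student has no row for this subject (null in Spark).
            record[subject] = sums[roll_no].get(subject)
        values = [record[s] for s in subjects]
        # Adding a null in Spark SQL yields null, so mirror that here.
        record["total"] = None if None in values else sum(values)
        result.append(record)
    return result


# Hypothetical sample data, for illustration only.
sample_rows = [
    (1, "Maths", 80),
    (1, "English", 70),
    (2, "Maths", 90),
    (2, "English", 60),
]

for record in pivot_and_total(sample_rows):
    print(record)
```

In Spark itself, a missing subject can be counted as zero by wrapping each column in the expression with `coalesce`, for example `coalesce(Computer, 0) + coalesce(English, 0)` and so on.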

