Transpose Rows into Columns and Compute total_marks with PySpark
Input:
Output:
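from pyspark.sql.functions import expr
from pyspark.sql.types import StructType, StructField, StringType, IntegerType
# Assumed schema (the original post does not show it): one row per student and subject
schema = StructType([StructField("Roll_No", StringType()), StructField("Subject", StringType()), StructField("Marks", IntegerType())])
df = spark.read.format("csv").schema(schema).load("file:///home/cloudera/marks.txt")
df.show()
df.printSchema()
# Turn each distinct Subject value into its own column, summing Marks per Roll_No
dfs = df.groupBy("Roll_No").pivot("Subject").sum("Marks")
dfs.show()
# Add the per-subject columns to get each student's total
dfx = dfs.withColumn("total", expr("Computer + English + Maths + Science + Tamil"))
dfx.show()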
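To make the pivot step easier to follow, here is a minimal plain-Python sketch of what groupBy("Roll_No").pivot("Subject").sum("Marks") followed by the total column computes. The rows and the helper names pivot_sum and add_totals are hypothetical and are not taken from marks.txt. One difference worth keeping in mind: in Spark, a Roll_No with no row for some subject gets null in that column, and adding null to a number gives null, so the total above would be null for that student unless the nulls are replaced first. The sketch simply skips missing subjects.

```python
from collections import defaultdict

# Hypothetical rows of (Roll_No, Subject, Marks); not the contents of marks.txt.
rows = [
    (1, "Maths", 80),
    (1, "English", 70),
    (2, "Maths", 60),
    (2, "English", 90),
    (2, "Science", 50),
]


def pivot_sum(rows):
    """Group by roll number and sum marks per subject."""
    table = defaultdict(lambda: defaultdict(int))
    for roll_no, subject, marks in rows:
        table[roll_no][subject] += marks
    return {roll_no: dict(subjects) for roll_no, subjects in table.items()}


def add_totals(pivoted):
    """Add a 'total' entry per roll number; absent subjects contribute nothing."""
    return {
        roll_no: {**subjects, "total": sum(subjects.values())}
        for roll_no, subjects in pivoted.items()
    }


result = add_totals(pivot_sum(rows))
for roll_no, subjects in sorted(result.items()):
    print(roll_no, subjects)
```

Each key of the inner dictionary plays the role of a pivoted column, and "total" corresponds to the column added with withColumn.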