How do I convert a column of Unix epoch to Date in an Apache Spark DataFrame using Java?
I have a JSON data file containing one property, [creationdate], which is a Unix epoch in "long" number type. The Apache Spark DataFrame schema is below:
root
 |-- creationdate: long (nullable = true)
 |-- id: long (nullable = true)
 |-- posttypeid: long (nullable = true)
 |-- tags: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- title: string (nullable = true)
 |-- viewcount: long (nullable = true)
I want to group by "creationdate_year", which needs to be derived from "creationdate".
What's the easiest way to do this kind of conversion in a DataFrame using Java?
After checking the Spark DataFrame API and SQL functions, I came up with the snippet below:
import static org.apache.spark.sql.functions.from_unixtime;

DataFrame df = sqlContext.read().json("my_json_data_file");
DataFrame df_dateConverted = df.withColumn("creationDt",
        from_unixtime(df.col("creationdate").divide(1000)));

The reason the "creationdate" column is divided by 1000 is that the time units differ: the original "creationdate" is a Unix epoch in milliseconds, while Spark SQL's from_unixtime is designed to handle a Unix epoch in seconds.
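From here, the "creationdate_year" column needed for the groupBy can be derived with Spark SQL's year() function. A minimal sketch, assuming the same Spark 1.x DataFrame API as above (the df_withYear variable and the counting aggregation are illustrative):

import static org.apache.spark.sql.functions.year;

// year() implicitly casts the "yyyy-MM-dd HH:mm:ss" string produced
// by from_unixtime, so the converted column can be used directly
DataFrame df_withYear = df_dateConverted.withColumn("creationdate_year",
        year(df_dateConverted.col("creationDt")));

// group by the derived year, e.g. counting rows per year
df_withYear.groupBy("creationdate_year").count().show();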