hadoop - Hive on Tez: predicate pushdown doesn't work in a view using a window function on a partitioned table
Using Hive on Tez, running this query against the view causes a full table scan, even though the table is partitioned on regionid and id. The same query in Cloudera Impala takes 0.6s to complete, whereas on Hortonworks Data Platform with Hive on Tez it takes 800s. I've come to the conclusion that in Hive on Tez the window function prevents the predicate from being pushed down to the inner select, which causes the full table scan.
    create view latestposition as
    with t1 as (
      select *,
             row_number() over (partition by regionid, id, deviceid order by ts desc) as rownos
      from positions
    )
    select * from t1 where rownos = 1;

    select *
    from latestposition
    where regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
      and id = 1422792010
      and deviceid = '6c5d1a30-2331-448b-a726-a380d6b3a432';

I've tried joining the table to itself using the max function to get the latest record. That works and finishes in a few seconds, but it is still too slow for our use case. If I remove the window function, the predicate gets pushed down and the query returns in milliseconds.
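For anyone comparing the options, here is a rough sketch of two ways to get the filter onto the partitions without relying on the optimizer. Neither is the original view. Both are assumptions based on the table and column names in the question, and I have not benchmarked them. The first applies the filter by hand inside the inner select, which returns the same rows because the filter columns are exactly the columns in the window's partition by clause. The second is one possible shape for the max join mentioned above, and it assumes ts is unique per (regionid, id, deviceid).

```sql
-- Sketch 1: filter inside the inner select so partition pruning happens
-- before the window function is evaluated. Filtering on the window's
-- PARTITION BY columns does not change row_number() within a group.
select *
from (
  select *,
         row_number() over (partition by regionid, id, deviceid order by ts desc) as rownos
  from positions
  where regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
    and id = 1422792010
    and deviceid = '6c5d1a30-2331-448b-a726-a380d6b3a432'
) t1
where rownos = 1;

-- Sketch 2: latest record via max(ts) and a self join.
-- Assumption: ts is unique per (regionid, id, deviceid); ties would
-- return more than one row.
select p.*
from positions p
join (
  select regionid, id, deviceid, max(ts) as max_ts
  from positions
  where regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
    and id = 1422792010
    and deviceid = '6c5d1a30-2331-448b-a726-a380d6b3a432'
  group by regionid, id, deviceid
) latest
  on  p.regionid = latest.regionid
  and p.id       = latest.id
  and p.deviceid = latest.deviceid
  and p.ts       = latest.max_ts
where p.regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
  and p.id = 1422792010;
```

The drawback of the first sketch is that the filter values have to be written into the query itself, so it cannot be wrapped in a reusable view. Prefixing either statement with explain should show whether the scan on positions is limited to the expected partition.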
If anyone has any ideas, they would be appreciated.
For those interested, I also posted this question on the Hortonworks community forum. The guys on there raised a bug on the Hive JIRA and are actively working on it.