hadoop - Hive on Tez: predicate pushdown doesn't work in a view using a window function on a partitioned table


Using Hive on Tez, running the query below against a view causes a full table scan, even though the table is partitioned on regionid and id. The same query in Cloudera Impala takes 0.6s to complete, whereas on Hortonworks Data Platform with Hive on Tez it takes 800s. I've come to the conclusion that in Hive on Tez, using a window function prevents the predicate from being pushed down to the inner select, which causes the full table scan.

CREATE VIEW latestposition AS
WITH t1 AS (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY regionid, id, deviceid ORDER BY ts DESC) AS rownos
  FROM positions
)
SELECT * FROM t1 WHERE rownos = 1;

SELECT * FROM latestposition
WHERE regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
  AND id = 1422792010
  AND deviceid = '6c5d1a30-2331-448b-a726-a380d6b3a432';
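One way to check whether the predicate reaches the partitioned table is to compare query plans. The sketch below assumes the latestposition view and positions table from the question. The second statement is a hand-written variant, not part of the original setup, that places the filter inside the inner select. Because the filter columns are exactly the window's PARTITION BY columns, filtering before or after ROW_NUMBER() should return the same rows. EXPLAIN DEPENDENCY should list the input partitions each query reads, though the exact output depends on the Hive version.

```sql
-- Plan for the query through the view: check whether the filter on
-- regionid / id appears at the table scan or only after the window step.
EXPLAIN
SELECT * FROM latestposition
WHERE regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
  AND id = 1422792010
  AND deviceid = '6c5d1a30-2331-448b-a726-a380d6b3a432';

-- Hand-pushed variant for comparison (illustrative, bypasses the view):
-- the filter is applied before the window function is evaluated.
EXPLAIN DEPENDENCY
SELECT *
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY regionid, id, deviceid ORDER BY ts DESC) AS rownos
  FROM positions
  WHERE regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
    AND id = 1422792010
    AND deviceid = '6c5d1a30-2331-448b-a726-a380d6b3a432'
) t1
WHERE rownos = 1;
```

If the first plan scans every partition while the second reads only one, that points to the window function blocking the pushdown rather than to a problem with the partitioning itself.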

I've tried joining the table using the max function to get the latest record. This works and finishes in a few seconds, but that is still too slow for our use case. If I remove the window function, the predicate gets pushed down and the query returns in milliseconds.
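For reference, the join-with-max approach could look roughly like the sketch below. It is an illustration, not the exact query I ran: the view name latestposition_maxjoin and the aliases p, m and max_ts are hypothetical, and it assumes ts is unique within each (regionid, id, deviceid) group. If ts has ties, this returns more than one row per device, unlike the ROW_NUMBER() version.

```sql
-- Hypothetical view name; assumes ts is unique per (regionid, id, deviceid).
CREATE VIEW latestposition_maxjoin AS
SELECT p.*
FROM positions p
JOIN (
  -- Latest timestamp per device within each partition
  SELECT regionid, id, deviceid, MAX(ts) AS max_ts
  FROM positions
  GROUP BY regionid, id, deviceid
) m
  ON  p.regionid = m.regionid
  AND p.id       = m.id
  AND p.deviceid = m.deviceid
  AND p.ts       = m.max_ts;

SELECT * FROM latestposition_maxjoin
WHERE regionid = '1d6a0be1-6366-4692-9597-ebd5cd0f01d1'
  AND id = 1422792010
  AND deviceid = '6c5d1a30-2331-448b-a726-a380d6b3a432';
```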

If anyone has any ideas, they would be much appreciated.

For anyone interested, I also posted this question on the Hortonworks community forum. The guys on there raised it as a bug on the Hive JIRA and are actively working on it.

https://community.hortonworks.com/questions/8880/hive-on-tez-pushdown-predicate-doesnt-work-in-part.html

https://issues.apache.org/jira/browse/hive-12808

