Azure SQL DW CTAS of over 102,400 rows to one distribution doesn't automatically compress -
i thought way columnstores worked if bulk load on 102,400 rows 1 distribution of columnstore, automatically compress it. i'm not observing in azure sql dw.
i'm doing following ctas statement:
create table columnstoredemoctas (clustered columnstore index, distribution=hash(column1)) select top 102401 cast(1 int) column1, f.* factinternetsales f cross join sys.objects o1 cross join sys.objects o2 now check status of columnstore row groups:
select t.name ,ni.distribution_id ,csrowgroups.state_description ,csrowgroups.total_rows ,csrowgroups.deleted_rows sys.tables t join sys.indexes on t.object_id = i.object_id join sys.pdw_index_mappings indexmap on i.object_id = indexmap.object_id , i.index_id = indexmap.index_id join sys.pdw_nodes_indexes ni on indexmap.physical_name = ni.name , indexmap.index_id = ni.index_id left join sys.pdw_nodes_column_store_row_groups csrowgroups on csrowgroups.object_id = ni.object_id , csrowgroups.pdw_node_id = ni.pdw_node_id , csrowgroups.distribution_id = ni.distribution_id , csrowgroups.index_id = ni.index_id t.name = 'columnstoredemoctas' order 1,2,3,4 desc; i end 1 open rowgroup 102401 rows. did misunderstand behavior of columnstores? azure sql dw different?
i see same behavior if bulk insert ssis of same number of rows 1 buffer.
i tried drew's suggestion of inserting on 6.5 million rows , still end open row stores:
create table columnstoredemowide (clustered columnstore index, distribution=hash(column1)) select top 7000000 row_number() on (order f.productkey) column1, f.* factinternetsales f cross join sys.objects o cross join sys.objects o2 cross join sys.objects o3
this defect in service. fix being rolled out. if try out on japan west example see behaviour expect.
Comments
Post a Comment