milvus/internal/util
shaoting-huang 88b373b024
enhance: binlog primary key turn off dict encoding (#34358)
issue: #34357 

Go Parquet uses dictionary encoding by default and falls back to plain
encoding if the dictionary size exceeds the dictionary page size limit.
Users can specify a custom fallback encoding with
`parquet.WithEncoding(ENCODING_METHOD)` in the writer properties. However,
Go Parquet [falls back to plain
encoding](e65c1e295d/go/parquet/file/column_writer_types.gen.go.tmpl (L238))
rather than the custom encoding method the user provides. Therefore, this
patch only turns off dictionary encoding for the primary key.
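
The dictionary-then-fallback behaviour described above can be illustrated
with a toy model. This is a minimal sketch, not the Go Parquet
implementation: `toyColumn`, `dictLimitBytes`, and the 1024-byte limit are
hypothetical names and values chosen for illustration, and page layout,
index encoding, and compression are ignored. It only shows that a column of
unique keys never reuses a dictionary entry and soon hits the limit, while
a column with few distinct values stays dictionary-encoded.

```go
package main

import "fmt"

// toyColumn is a hypothetical, simplified stand-in for a column writer.
// It is NOT the Go Parquet writer; it only mimics the fallback rule.
type toyColumn struct {
	dictLimitBytes int           // hypothetical dictionary size limit
	dict           map[int64]int // value -> dictionary index
	dictValues     int           // values written while dictionary-encoded
	plainValues    int           // values written after the fallback
	fellBack       bool          // true once the limit has been exceeded
}

func newToyColumn(limit int) *toyColumn {
	return &toyColumn{dictLimitBytes: limit, dict: make(map[int64]int)}
}

// write records one int64 value (8 bytes per dictionary entry).
func (c *toyColumn) write(v int64) {
	if c.fellBack {
		c.plainValues++ // plain encoding: the value is stored directly
		return
	}
	if _, ok := c.dict[v]; !ok {
		c.dict[v] = len(c.dict) // new distinct value grows the dictionary
	}
	c.dictValues++
	// Once the dictionary outgrows the limit, later values go plain.
	if len(c.dict)*8 > c.dictLimitBytes {
		c.fellBack = true
	}
}

func main() {
	unique := newToyColumn(1024)
	repeated := newToyColumn(1024)
	for i := int64(0); i < 1000; i++ {
		unique.write(i)       // auto-ID style: every value is distinct
		repeated.write(i % 4) // only four distinct values
	}
	fmt.Println("unique keys fell back:", unique.fellBack,
		"dict entries:", len(unique.dict))
	fmt.Println("repeated keys fell back:", repeated.fellBack,
		"dict entries:", len(repeated.dict))
}
```

In this model the dictionary built for the unique column holds one entry
per value written before the fallback, so it yields no reuse at all.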

In a benchmark with 5 million auto-ID primary keys, the parquet file size
shrinks from 13.93 MB to 8.36 MB when dictionary encoding is turned off,
reducing primary key storage space by 40%.
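
The 40% figure follows from the two file sizes quoted above. A small
sketch of the arithmetic, where `reductionPercent` is a hypothetical helper
name used only for this illustration:

```go
package main

import "fmt"

// reductionPercent returns how much smaller `after` is than `before`,
// expressed as a percentage of `before`.
func reductionPercent(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Sizes in MB, taken from the benchmark in the commit message.
	fmt.Printf("%.1f%%\n", reductionPercent(13.93, 8.36))
}
```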

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-07-17 17:47:44 +08:00
..
analyzecgowrapper enhance: Support analyze data (#33651) 2024-06-06 17:37:51 +08:00
bloomfilter enhance: Use BatchPkExist to reduce bloom filter func call cost (#33611) 2024-06-13 17:57:56 +08:00
cgo enhance: async search and retrieve in cgo (#33228) 2024-06-22 09:38:02 +08:00
cgoconverter
clustering Add an option to enable/disable vector field clustering key (#34097) 2024-06-25 18:52:04 +08:00
componentutil enhance: improve check health (#33800) 2024-07-01 10:16:06 +08:00
dependency enhance: move rocksmq from internal to pkg module (#33881) 2024-06-25 21:18:15 +08:00
exprutil feat: support partition key isolation (#34336) 2024-07-11 19:01:35 +08:00
flowgraph fix: Correct the update logic of timerecorder (#34339) 2024-07-04 16:34:17 +08:00
funcutil
grpcclient enhance: Force to reset coord connection for unavailable error (#33908) 2024-06-18 14:53:59 +08:00
hookutil enhance: add the config to control the way when fail to init plugin (#32680) 2024-05-07 11:01:31 +08:00
importutilv2 enhance: binlog primary key turn off dict encoding (#34358) 2024-07-17 17:47:44 +08:00
indexcgowrapper feat: Major compaction (#33620) 2024-06-10 21:34:08 +08:00
initcore fix: Pass otlpSecure config when setup segcore tracing (#34193) 2024-06-26 19:18:04 +08:00
metrics
mock enhance: Rename Compaction to CompactionV2 (#33858) 2024-06-16 22:07:57 +08:00
pipeline enhance: move rocksmq from internal to pkg module (#33881) 2024-06-25 21:18:15 +08:00
proxyutil enhance: update shard leader cache when leader location changed (#32470) 2024-05-08 10:05:29 +08:00
quota feat: support rate limiter based on db and partition levels (#31070) 2024-04-12 16:01:19 +08:00
ratelimitutil feat: support rate limiter based on db and partition levels (#31070) 2024-04-12 16:01:19 +08:00
segmentutil
sessionutil fix: streaming service related fix patch (#34696) 2024-07-16 15:49:38 +08:00
streamingutil fix: ut failure for grpc upgrade (#34726) 2024-07-16 21:49:40 +08:00
streamrpc enhance: Merge query stream result for reduce delete task (#32855) 2024-05-27 18:15:43 +08:00
testutil enhance: Optimize bulk insert unittest (#33224) 2024-05-24 10:23:41 +08:00
tsoutil enhance: move rocksmq from internal to pkg module (#33881) 2024-06-25 21:18:15 +08:00
typeutil feat: Major compaction (#33620) 2024-06-10 21:34:08 +08:00
wrappers