Query Engines

StarRocks

Resources

configs

Hudi

external catalog

metadata sync

code

The FE fetches the file listing for each partition at query time. Depending on the table type, it either merges log files or not. It can also get the file listing from the Hudi metadata table.
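The behaviour above can be pictured with a minimal Python sketch. It is not StarRocks or Hudi code: the function name, the `(file_id, kind, path)` tuples and the `FileSlice` type are hypothetical, and it ignores the commit timeline and file versions entirely. It only illustrates that log files are attached to a scan for merge-on-read tables and skipped for copy-on-write tables.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class FileSlice:
    """Hypothetical view of one file group: a base file plus log files to merge."""
    base_file: Optional[str] = None
    log_files: List[str] = field(default_factory=list)


def plan_partition_scan(
    files: Iterable[Tuple[str, str, str]], table_type: str
) -> Dict[str, FileSlice]:
    """Group the files listed for one partition into slices, keyed by file id.

    files: (file_id, kind, path) tuples, where kind is "base" or "log".
    table_type: "COPY_ON_WRITE" or "MERGE_ON_READ".
    """
    slices: Dict[str, FileSlice] = defaultdict(FileSlice)
    for file_id, kind, path in files:
        if kind == "base":
            slices[file_id].base_file = path
        elif kind == "log" and table_type == "MERGE_ON_READ":
            # Only merge-on-read scans need the log files.
            slices[file_id].log_files.append(path)
    return dict(slices)


listing = [
    ("f1", "base", "p/f1.parquet"),
    ("f1", "log", "p/.f1.log.1"),
    ("f2", "base", "p/f2.parquet"),
]
mor_plan = plan_partition_scan(listing, "MERGE_ON_READ")
cow_plan = plan_partition_scan(listing, "COPY_ON_WRITE")
```

With this listing, `mor_plan["f1"]` carries one log file to merge, while `cow_plan["f1"]` carries none.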

caching

There are three mechanisms:

  1. Query cache: only used with native tables. It stores intermediate results, not the final ones.
  2. Storage cache: only used with native tables stored on cloud storage. It also keeps newly written data locally.
  3. Block cache: only used for external tables on cloud storage. It stores files locally, either on disk or in RAM.
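As a quick way to read the list above, here is a small Python sketch that encodes the same rules. It is only a restatement of these notes, not a StarRocks API: the function name and the `"native"` / `"external"` labels are made up for illustration.

```python
from typing import List


def applicable_caches(table_kind: str, on_cloud_storage: bool) -> List[str]:
    """Return the caches that can apply, according to the notes above.

    table_kind: "native" or "external" (hypothetical labels).
    on_cloud_storage: True when the data files live on cloud storage.
    """
    caches: List[str] = []
    if table_kind == "native":
        # Query cache applies to native tables wherever they are stored.
        caches.append("query cache")
        if on_cloud_storage:
            caches.append("storage cache")
    elif table_kind == "external" and on_cloud_storage:
        caches.append("block cache")
    return caches


hudi_table_caches = applicable_caches("external", on_cloud_storage=True)
```

For an external Hudi table on cloud storage, only the block cache is returned.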

Athena

From the doc: "Use ORC for complex types. Currently, when you query columns stored in Parquet that have complex data types (for example, array, map, or struct), Athena reads an entire row of data instead of selectively reading only the specified columns. This is a known issue in Athena. As a workaround, consider using ORC."

JDBC

Details are in the Simba JDBC manual.

jdbc:awsathena://User=[AccessKey];Password=[SecretKey];S3OutputLocation=[Output];[Property1]=[Value1];[Property2]=[Value2];...
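To make the template concrete, the following Python sketch assembles a URL of that shape. The helper name is hypothetical, the values in the usage line are placeholders, and `AwsRegion` is used only as an example of an extra property; check the manual for the properties the driver actually accepts. The sketch does no escaping, so it assumes the values contain no `;` or `=`.

```python
def build_athena_jdbc_url(
    access_key: str, secret_key: str, s3_output_location: str, **properties: str
) -> str:
    """Fill in the jdbc:awsathena:// template shown above.

    Extra keyword arguments become additional Property=Value pairs,
    in the order they are passed.
    """
    parts = [
        f"User={access_key}",
        f"Password={secret_key}",
        f"S3OutputLocation={s3_output_location}",
    ]
    parts += [f"{name}={value}" for name, value in properties.items()]
    # Each pair is terminated by ';', as in the template.
    return "jdbc:awsathena://" + "".join(part + ";" for part in parts)


url = build_athena_jdbc_url(
    "AK", "SK", "s3://example-bucket/out/", AwsRegion="eu-west-1"
)
```

In practice the keys should come from a credentials provider or environment variables rather than being hard-coded in the URL.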

Dremio

React ?
