What is the most efficient way to exchange high volume data between 2 processes?


1 Answer


Amazon Redshift is designed to store and query petabyte-scale datasets. Using Amazon S3 you can stage and accumulate data from multiple source systems before executing a bulk COPY operation. The following methods allow efficient and fast transfer of these bulk datasets into Amazon Redshift.

When loading multiple files into a single table, use a single COPY command for the table rather than multiple COPY commands. Amazon Redshift automatically parallelizes the data ingestion, so a single COPY command that bulk loads the data ensures optimal use of cluster resources and the quickest possible throughput.

Organizing the data into multiple, evenly sized files enables the COPY command to ingest this data using all available resources in the Amazon Redshift cluster. Compressing the files (gzip) further reduces COPY times. A sketch of such a load is shown below.

Because the downstream ETL processes depend on this COPY command to complete, wlm_query_slot_count is used to claim all the memory available to the queue. This helps the COPY command complete as quickly as possible.
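For example, assuming the extracts have been split into evenly sized, gzipped files under a single S3 prefix (the bucket, prefix, target table name, and IAM role below are placeholders, not from the original answer), a single COPY command loads all of the files in parallel:

COPY source
FROM 's3://my-etl-bucket/staging/2017-10-31/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
GZIP
DELIMITER '|';

Redshift distributes the file list across the slices of the cluster, which is why one COPY over many evenly sized files outperforms many COPY commands issued one file at a time.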

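Similarly, a minimal sketch of claiming the queue's memory for the duration of the load (this assumes the session runs in a WLM queue configured with 5 slots; adjust the number to your queue and reset it afterwards):

SET wlm_query_slot_count TO 5;   -- claim all slots, and their memory, in the queue
COPY source FROM 's3://my-etl-bucket/staging/2017-10-31/' IAM_ROLE '...' GZIP;
SET wlm_query_slot_count TO 1;   -- return the session to a single slot
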
A typical downstream ETL run then performs the staging-table transformation and the daily and weekly aggregate refreshes in a single transaction, so the cluster commits once rather than after every statement:

BEGIN;

-- Stage the transformed source data in a temporary table
CREATE TEMPORARY TABLE staging_table (..);
INSERT INTO staging_table SELECT .. FROM source;         -- transformation logic

-- Replace the affected day's rows with the fresh daily aggregate
DELETE FROM daily_table WHERE dataset_date = ? ;
INSERT INTO daily_table SELECT .. FROM staging_table;    -- daily aggregate

-- Replace the affected week's rows with the fresh weekly aggregate
DELETE FROM weekly_table WHERE weekending_date = ? ;
INSERT INTO weekly_table SELECT .. FROM staging_table;   -- weekly aggregate

COMMIT;