Questions and Answers
Question zDtHu7du1HTzxqYyMDeg
Question
A retail company uses an Amazon Redshift data warehouse and an Amazon S3 bucket. The company ingests retail order data into the S3 bucket every day.
The company stores all order data at a single path within the S3 bucket. The data has more than 100 columns. The company ingests the order data from a third-party application that generates more than 30 files in CSV format every day. Each CSV file is between 50 and 70 MB in size.
The company uses Amazon Redshift Spectrum to run queries that select sets of columns. Users aggregate metrics based on daily orders. Recently, users have reported that the performance of the queries has degraded. A data engineer must resolve the performance issues for the queries.
Which combination of steps will meet this requirement with LEAST developmental effort? (Choose two.)
Choices
- A: Configure the third-party application to create the files in a columnar format.
- B: Develop an AWS Glue ETL job to convert the multiple daily CSV files to one file for each day.
- C: Partition the order data in the S3 bucket based on order date.
- D: Configure the third-party application to create the files in JSON format.
- E: Load the JSON data into the Amazon Redshift table in a SUPER type column.
answer?
Answer: AC Answer_ET: AC Community answer AC (67%) BC (33%) Discussion
Comment 1362356 by Ell89
- Upvotes: 1
Selected Answer: AC Using Parquet or ORC is efficient, and so is partitioning by order date, because each query then scans a smaller range of data.
Comment 1358484 by italiancloud2025
- Upvotes: 1
Selected Answer: BC No, because option A involves modifying the third-party application so that it generates files in a columnar format, which may be more complex or infeasible, whereas option B uses a Glue job to consolidate the CSV files without touching the source. Option C is still essential to partition by date and optimize the queries.
Comment 1317242 by emupsx1
- Upvotes: 1
Selected Answer: AC https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html
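Option C is commonly implemented with Hive-style key prefixes, so that a query filtering on order date only has to read the matching prefixes. A minimal Python sketch of that key layout follows. The `orders` prefix, the `order_date` partition column name, and the file name are assumptions for illustration and do not come from the question.

```python
from datetime import date


def partitioned_key(order_date: date, filename: str, prefix: str = "orders") -> str:
    """Build a Hive-style S3 key: <prefix>/order_date=<YYYY-MM-DD>/<filename>."""
    return f"{prefix}/order_date={order_date.isoformat()}/{filename}"


# Hypothetical file produced by the third-party application on one day.
key = partitioned_key(date(2024, 1, 15), "part-0001.csv")
print(key)
```

Each new prefix still has to be registered as a partition of the external table before Redshift Spectrum can prune on it.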
Question H6apOuWdnjeGjcKzklBG
Question
A company stores customer records in Amazon S3. The company must not delete or modify the customer record data for 7 years after each record is created. The root user also must not have the ability to delete or modify the data.
A data engineer wants to use S3 Object Lock to secure the data.
Which solution will meet these requirements?
Choices
- A: Enable governance mode on the S3 bucket. Use a default retention period of 7 years.
- B: Enable compliance mode on the S3 bucket. Use a default retention period of 7 years.
- C: Place a legal hold on individual objects in the S3 bucket. Set the retention period to 7 years.
- D: Set the retention period for individual objects in the S3 bucket to 7 years.
answer?
Answer: B Answer_ET: B Community answer B (100%) Discussion
Comment 1341158 by MerryLew
- Upvotes: 2
Selected Answer: B “In compliance mode, a protected object version can’t be overwritten or deleted by any user, including the root user in your AWS account. When an object is locked in compliance mode, its retention mode can’t be changed, and its retention period can’t be shortened. Compliance mode helps ensure that an object version can’t be overwritten or deleted for the duration of the retention period.”
Comment 1317246 by emupsx1
- Upvotes: 3
Selected Answer: B https://aws.amazon.com/s3/features/object-lock/
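For reference, answer B can be sketched with boto3 roughly as below. This assumes the bucket already has Object Lock enabled; the bucket name passed to `apply_lock` is hypothetical, and the function needs AWS credentials, so it is defined but not called here.

```python
def compliance_lock_config(years: int = 7) -> dict:
    """Return an Object Lock configuration with a default COMPLIANCE retention."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": years}},
    }


def apply_lock(bucket: str, years: int = 7) -> None:
    """Apply the default retention to a bucket (requires credentials; not run here)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    s3 = boto3.client("s3")
    s3.put_object_lock_configuration(
        Bucket=bucket,
        ObjectLockConfiguration=compliance_lock_config(years),
    )


config = compliance_lock_config()
print(config["Rule"]["DefaultRetention"])
```

Swapping `"COMPLIANCE"` for `"GOVERNANCE"` would give option A, which users with the right permissions can still override.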
Question 0I5mbuFfIuSm1i2vG5Zh
Question
A data engineer needs to create a new empty table in Amazon Athena that has the same schema as an existing table named old_table.
Which SQL statement should the data engineer use to meet this requirement?
Choices
- A: CREATE TABLE new_table AS SELECT * FROM old_tables;
- B: INSERT INTO new_table SELECT * FROM old_table;
- C: CREATE TABLE new_table (LIKE old_table);
- D: CREATE TABLE new_table AS (SELECT * FROM old_table) WITH NO DATA;
answer?
Answer: D Answer_ET: D Community answer D (100%) Discussion
Comment 1309172 by AgboolaKun
- Upvotes: 2
Selected Answer: D D is the correct answer.
Here is why:
The AS clause allows you to define the new table’s schema based on a SELECT statement.
The WITH NO DATA clause at the end explicitly tells Athena to create the table structure without copying any data.
For more information, see the “Creating an empty copy of an existing table” section in this documentation - https://docs.aws.amazon.com/athena/latest/ug/ctas-examples.html
Comment 1305777 by Eleftheriia
- Upvotes: 1
Selected Answer: D WITH NO DATA creates a new table with the same schema as the old one, without copying any rows. https://docs.aws.amazon.com/athena/latest/ug/create-table-as.html
Comment 1305419 by pikuantne
- Upvotes: 1
Selected Answer: D D is correct
Comment 1303477 by Parandhaman_Margan
- Upvotes: 1
Answer: D
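A small Python sketch of how the statement from option D might be built and submitted. The `run_in_athena` helper, the database name, and the S3 output location are hypothetical; the helper needs AWS credentials and is not called here.

```python
def empty_copy_sql(new_table: str, old_table: str) -> str:
    """Build the CTAS statement from option D: copy the schema, copy no rows."""
    return f"CREATE TABLE {new_table} AS (SELECT * FROM {old_table}) WITH NO DATA;"


def run_in_athena(sql: str, database: str, output_location: str) -> str:
    """Submit the statement to Athena and return the query execution ID (not run here)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


sql = empty_copy_sql("new_table", "old_table")
print(sql)
```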
Question cRVgnAk0leTLEkJ44Ri6
Question
A data engineer needs to create an Amazon Athena table based on a subset of data from an existing Athena table named cities_world. The cities_world table contains cities that are located around the world. The data engineer must create a new table named cities_us to contain only the cities from cities_world that are located in the US.
Which SQL statement should the data engineer use to meet this requirement?
Choices
- A: INSERT INTO cities_usa (city,state) SELECT city, state FROM cities_world WHERE country='usa';
- B: MOVE city, state FROM cities_world TO cities_usa WHERE country='usa';
- C: INSERT INTO cities_usa SELECT city, state FROM cities_world WHERE country='usa';
- D: UPDATE cities_usa SET (city, state) = (SELECT city, state FROM cities_world WHERE country='usa');
answer?
Answer: A Answer_ET: A Community answer A (100%) Discussion
Comment 1400036 by saqib839
- Upvotes: 1
Selected Answer: A However, the question says to create a table, so strictly speaking none of the options is correct.
Comment 1341160 by MerryLew
- Upvotes: 1
Selected Answer: A The INSERT INTO SELECT statement copies data from one table and inserts it into another table.
Comment 1305892 by Eleftheriia
- Upvotes: 4
Selected Answer: A INSERT INTO cities_usa (city,state) SELECT city,state FROM cities_world WHERE country='usa'
Comment 1305334 by truongnguyen86
- Upvotes: 1
Selected Answer: A It should be A or C, but C will fail if cities_usa contains more than 2 columns, so specifying the list of columns to insert is the better choice.
Comment 1303478 by Parandhaman_Margan
- Upvotes: 1
Answer: D
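The difference between options A and C can be reproduced locally. The sketch below uses SQLite as a stand-in for Athena, so the exact error will differ, and the extra `population` column and the sample rows are assumptions added to show the failure case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities_world (city TEXT, state TEXT, country TEXT);
    -- Hypothetical third column, so the target has more columns than the SELECT.
    CREATE TABLE cities_usa (city TEXT, state TEXT, population INTEGER);
    INSERT INTO cities_world VALUES
        ('Austin', 'TX', 'usa'),
        ('Lyon', NULL, 'france');
""")

# Option C style: no column list, so 2 selected values must fill 3 columns.
try:
    conn.execute(
        "INSERT INTO cities_usa SELECT city, state FROM cities_world WHERE country='usa'"
    )
    option_c_failed = False
except sqlite3.OperationalError:
    option_c_failed = True

# Option A style: the column list names exactly the columns being filled.
conn.execute(
    "INSERT INTO cities_usa (city, state) "
    "SELECT city, state FROM cities_world WHERE country='usa'"
)
rows = conn.execute("SELECT city, state, population FROM cities_usa").fetchall()
print(option_c_failed, rows)
```

If cities_usa had exactly two columns, both forms would succeed, which is why the explicit column list is the safer choice.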
Question PvaK00CG9s2z5cr8ApIg
Question
A company implements a data mesh that has a central governance account. The company needs to catalog all data in the governance account. The governance account uses AWS Lake Formation to centrally share data and grant access permissions.
The company has created a new data product that includes a group of Amazon Redshift Serverless tables. A data engineer needs to share the data product with a marketing team. The marketing team must have access to only a subset of columns. The data engineer needs to share the same data product with a compliance team. The compliance team must have access to a different subset of columns than the marketing team needs access to.
Which combination of steps should the data engineer take to meet these requirements? (Choose two.)
Choices
- A: Create views of the tables that need to be shared. Include only the required columns.
- B: Create an Amazon Redshift data share that includes the tables that need to be shared.
- C: Create an Amazon Redshift managed VPC endpoint in the marketing team’s account. Grant the marketing team access to the views.
- D: Share the Amazon Redshift data share to the Lake Formation catalog in the governance account.
- E: Share the Amazon Redshift data share to the Amazon Redshift Serverless workgroup in the marketing team’s account.
answer?
Answer: BD Answer_ET: BD Community answer BD (57%) D (43%) Discussion
Comment 1307023 by Eleftheriia
- Upvotes: 3
Selected Answer: D I think that D is one of the correct answers as described
Comment 1304346 by ae35a02
- Upvotes: 2
I think it is not A, because the question says access and permissions are centrally managed in Lake Formation … therefore: BD
Comment 1303971 by ae35a02
- Upvotes: 4
Selected Answer: BD Workgroups don't manage permissions on tables and views; they manage resource allocation for query execution.
Comment 1303480 by Parandhaman_Margan
- Upvotes: 2
Answer: AD
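Steps B and D might look roughly like the statements generated below. This is a sketch under stated assumptions: the datashare name, schema, table names, and account ID are hypothetical, and the statements follow my understanding of the Redshift datashare syntax, so check them against the current documentation before use. The per-team column subsets would then be granted in Lake Formation, not in Redshift.

```python
from typing import List


def datashare_statements(
    share: str, schema: str, tables: List[str], governance_account_id: str
) -> List[str]:
    """Build SQL to create a datashare and share it to an account's Data Catalog."""
    statements = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
    ]
    # Step B: add each table of the data product to the datashare.
    statements += [f"ALTER DATASHARE {share} ADD TABLE {schema}.{t};" for t in tables]
    # Step D: share to the Lake Formation-managed catalog in the governance account.
    statements.append(
        f"GRANT USAGE ON DATASHARE {share} "
        f"TO ACCOUNT '{governance_account_id}' VIA DATA CATALOG;"
    )
    return statements


stmts = datashare_statements("product_share", "public", ["orders", "customers"], "111122223333")
for s in stmts:
    print(s)
```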