Questions and Answers

Question STisj3OFt3RFwY0AQ2mf

Question

A company needs to load customer data that comes from a third party into an Amazon Redshift data warehouse. The company stores order data and product data in the same data warehouse. The company wants to use the combined dataset to identify potential new customers.

A data engineer notices that one of the fields in the source data includes values that are in JSON format.

How should the data engineer load the JSON data into the data warehouse with the LEAST effort?

Choices

  • A: Use the SUPER data type to store the data in the Amazon Redshift table.
  • B: Use AWS Glue to flatten the JSON data and ingest it into the Amazon Redshift table.
  • C: Use Amazon S3 to store the JSON data. Use Amazon Athena to query the data.
  • D: Use an AWS Lambda function to flatten the JSON data. Store the data in Amazon S3.
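
As background for option A, Amazon Redshift's SUPER data type can hold semi-structured JSON natively, so the field can be loaded without flattening. A minimal sketch follows; the table, bucket, role, and attribute names are placeholders, not part of the question:

```sql
-- Hypothetical staging table: one column holds the JSON field as SUPER.
CREATE TABLE customer_staging (
  customer_id    INT,
  raw_attributes SUPER      -- the JSON-formatted source field, stored as-is
);

-- Load from S3; FORMAT JSON 'auto' maps source fields to columns.
COPY customer_staging
FROM 's3://example-bucket/customers/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
FORMAT JSON 'auto';

-- PartiQL-style navigation into the SUPER column at query time.
SELECT customer_id, raw_attributes.email
FROM customer_staging;
```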

Question p5Nx7wJQnTJDEeUmM0TW

Question

A company wants to analyze sales records that the company stores in a MySQL database. The company wants to correlate the records with sales opportunities identified by Salesforce.

The company receives 2 GB of sales records every day. The company has 100 GB of identified sales opportunities. A data engineer needs to develop a process that will analyze and correlate sales records and sales opportunities. The process must run once each night.

Which solution will meet these requirements with the LEAST operational overhead?

Choices

  • A: Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to fetch both datasets. Use AWS Lambda functions to correlate the datasets. Use AWS Step Functions to orchestrate the process.
  • B: Use Amazon AppFlow to fetch sales opportunities from Salesforce. Use AWS Glue to fetch sales records from the MySQL database. Correlate the sales records with the sales opportunities. Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to orchestrate the process.
  • C: Use Amazon AppFlow to fetch sales opportunities from Salesforce. Use AWS Glue to fetch sales records from the MySQL database. Correlate the sales records with sales opportunities. Use AWS Step Functions to orchestrate the process.
  • D: Use Amazon AppFlow to fetch sales opportunities from Salesforce. Use Amazon Kinesis Data Streams to fetch sales records from the MySQL database. Use Amazon Managed Service for Apache Flink to correlate the datasets. Use AWS Step Functions to orchestrate the process.
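
To illustrate the orchestration piece that options C and D share, a nightly Step Functions state machine can chain an AppFlow flow run and a Glue job through service integrations. This is only a sketch under assumptions: the flow and job names are invented, and a production workflow would also wait for the AppFlow run to complete before starting the Glue job:

```json
{
  "Comment": "Nightly sketch: start an AppFlow run, then a Glue correlation job (names are placeholders)",
  "StartAt": "StartAppFlowRun",
  "States": {
    "StartAppFlowRun": {
      "Type": "Task",
      "Resource": "arn:aws:states:::aws-sdk:appflow:startFlow",
      "Parameters": { "FlowName": "salesforce-opportunities" },
      "Next": "RunGlueCorrelationJob"
    },
    "RunGlueCorrelationJob": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": { "JobName": "correlate-sales-records" },
      "End": true
    }
  }
}
```

The `.sync` suffix on the Glue integration makes the state machine wait for the job to finish, which suits a run-once-nightly batch process.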

Question aXAUEh2ywppsV5443j9d

Question

A company stores server logs in an Amazon S3 bucket. The company needs to keep the logs for 1 year. The logs are not required after 1 year.

A data engineer needs a solution to automatically delete logs that are older than 1 year.

Which solution will meet these requirements with the LEAST operational overhead?

Choices

  • A: Define an S3 Lifecycle configuration to delete the logs after 1 year.
  • B: Create an AWS Lambda function to delete the logs after 1 year.
  • C: Schedule a cron job on an Amazon EC2 instance to delete the logs after 1 year.
  • D: Configure an AWS Step Functions state machine to delete the logs after 1 year.
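
For reference, option A's approach is a declarative S3 Lifecycle rule; once attached to the bucket, S3 deletes matching objects automatically with no code to run. A minimal sketch, assuming the logs live under a `logs/` prefix (the prefix and rule ID are placeholders):

```json
{
  "Rules": [
    {
      "ID": "expire-server-logs-after-1-year",
      "Filter": { "Prefix": "logs/" },
      "Status": "Enabled",
      "Expiration": { "Days": 365 }
    }
  ]
}
```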

Question DFX3YMoq7MDaAYt1BUFI

Question

A company is designing a serverless data processing workflow in AWS Step Functions that involves multiple steps. The processing workflow ingests data from an external API, transforms the data by using multiple AWS Lambda functions, and loads the transformed data into Amazon DynamoDB.

The company needs the workflow to perform specific steps based on the content of the incoming data.

Which Step Functions state type should the company use to meet this requirement?

Choices

  • A: Parallel
  • B: Choice
  • C: Task
  • D: Map
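
For context, branching on the content of the input is what the Choice state type does in Amazon States Language: each rule compares a field of the state input and routes to a different next state. A minimal sketch, with invented field and state names:

```json
{
  "Comment": "Sketch of a Choice state routing on input content (names are placeholders)",
  "StartAt": "RouteByContent",
  "States": {
    "RouteByContent": {
      "Type": "Choice",
      "Choices": [
        { "Variable": "$.recordType", "StringEquals": "order",   "Next": "TransformOrder" },
        { "Variable": "$.recordType", "StringEquals": "product", "Next": "TransformProduct" }
      ],
      "Default": "HandleUnknown"
    },
    "TransformOrder":   { "Type": "Pass", "End": true },
    "TransformProduct": { "Type": "Pass", "End": true },
    "HandleUnknown":    { "Type": "Fail", "Error": "UnknownRecordType" }
  }
}
```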

Question i8SAYg4VG6Qo2kb1d0wl

Question

A data engineer created a table named cloudtrail_logs in Amazon Athena to query AWS CloudTrail logs and prepare data for audits. The data engineer needs to write a query to display errors with error codes that have occurred since the beginning of 2024. The query must return the 10 most recent errors.

Which query will meet these requirements?

Choices

  • A: select count(*) as TotalEvents, eventname, errorcode, errormessage from cloudtrail_logs where errorcode is not null and eventtime >= '2024-01-01T00:00:00Z' group by eventname, errorcode, errormessage order by TotalEvents desc limit 10;
  • B: select count(*) as TotalEvents, eventname, errorcode, errormessage from cloudtrail_logs where eventtime >= '2024-01-01T00:00:00Z' group by eventname, errorcode, errormessage order by TotalEvents desc limit 10;
  • C: select count(*) as TotalEvents, eventname, errorcode, errormessage from cloudtrail_logs where eventtime >= '2024-01-01T00:00:00Z' group by eventname, errorcode, errormessage order by eventname asc limit 10;
  • D: select count(*) as TotalEvents, eventname, errorcode, errormessage from cloudtrail_logs where errorcode is not null and eventtime >= '2024-01-01T00:00:00Z' group by eventname, errorcode, errormessage limit 10;
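
As background on the pattern the choices vary on: restricting CloudTrail events to real errors means filtering on `errorcode is not null`, restricting to 2024 onward means comparing `eventtime` against the ISO 8601 start-of-year timestamp, and "most recent" ordering sorts on `eventtime` rather than an aggregate. A sketch of that general shape (column names taken from the choices; not one of the answer options):

```sql
-- Illustrative only: errors with error codes since the start of 2024,
-- newest first. ISO 8601 strings in a fixed format sort chronologically.
SELECT eventname, errorcode, errormessage, eventtime
FROM cloudtrail_logs
WHERE errorcode IS NOT NULL
  AND eventtime >= '2024-01-01T00:00:00Z'
ORDER BY eventtime DESC
LIMIT 10;
```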