Questions and Answers

Question ULHo9HLLi3Q5IqiL0hzi

Question

A company loads each day's transaction data into Amazon Redshift tables at the end of the day. The company wants to be able to track which tables have been loaded and which tables still need to be loaded. A data engineer wants to store the load statuses of the Redshift tables in an Amazon DynamoDB table. The data engineer creates an AWS Lambda function to publish the details of the load statuses to DynamoDB.

How should the data engineer invoke the Lambda function to write load statuses to the DynamoDB table?

Choices

  • A: Use a second Lambda function to invoke the first Lambda function based on Amazon CloudWatch Events.
  • B: Use the Amazon Redshift Data API to publish an event to Amazon EventBridge. Configure an EventBridge rule to invoke the Lambda function.
  • C: Use the Amazon Redshift Data API to publish a message to an Amazon Simple Queue Service (Amazon SQS) queue. Configure the SQS queue to invoke the Lambda function.
  • D: Use a second Lambda function to invoke the first Lambda function based on AWS CloudTrail events.
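For context, the core of any of these options is the Lambda function that records a load status in DynamoDB. The following is a minimal sketch of such a handler, assuming a hypothetical DynamoDB table named redshift_load_status and an EventBridge-style event whose detail carries the table name, load date, and status; real event shapes depend on how the rule is configured.

```python
import boto3

# Hypothetical DynamoDB table name; adjust to your environment.
TABLE_NAME = "redshift_load_status"

dynamodb = boto3.resource("dynamodb")
status_table = dynamodb.Table(TABLE_NAME)

def lambda_handler(event, context):
    """Record the load status of a Redshift table in DynamoDB.

    Assumes the invoking event carries tableName, loadDate, and status
    in event["detail"]; these field names are illustrative only.
    """
    detail = event.get("detail", {})
    status_table.put_item(
        Item={
            "table_name": detail.get("tableName", "unknown"),
            "load_date": detail.get("loadDate", "unknown"),
            "status": detail.get("status", "LOADED"),
        }
    )
    return {"statusCode": 200}
```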

Question Iu0xZ0VfliO1qEqOGtO5

Question

A company receives test results from testing facilities that are located around the world. The company stores the test results in millions of 1 KB JSON files in an Amazon S3 bucket. A data engineer needs to process the files, convert them into Apache Parquet format, and load them into Amazon Redshift tables. The data engineer uses AWS Glue to process the files, AWS Step Functions to orchestrate the processes, and Amazon EventBridge to schedule jobs.

The company recently added more testing facilities. The time required to process files is increasing. The data engineer must reduce the data processing time.

Which solution will MOST reduce the data processing time?

Choices

  • A: Use AWS Lambda to group the raw input files into larger files. Write the larger files back to Amazon S3. Use AWS Glue to process the files. Load the files into the Amazon Redshift tables.
  • B: Use the AWS Glue dynamic frame file-grouping option to ingest the raw input files. Process the files. Load the files into the Amazon Redshift tables.
  • C: Use the Amazon Redshift COPY command to move the raw input files from Amazon S3 directly into the Amazon Redshift tables. Process the files in Amazon Redshift.
  • D: Use Amazon EMR instead of AWS Glue to group the raw input files. Process the files in Amazon EMR. Load the files into the Amazon Redshift tables.
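To illustrate the file-grouping mechanism referenced in option B, the sketch below reads the small JSON files with AWS Glue's groupFiles and groupSize connection options so that each task processes many files at once, then writes the result as Parquet. The S3 paths are placeholders, and the group size is an assumption.

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the small JSON test-result files, grouping them into larger
# chunks so each Spark task handles many files. groupSize is in bytes.
dyf = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={
        "paths": ["s3://example-test-results-bucket/raw/"],
        "recurse": True,
        "groupFiles": "inPartition",
        "groupSize": "104857600",  # ~100 MB per group (illustrative)
    },
    format="json",
)

# Write the grouped data back out as Parquet (placeholder path) before
# loading it into the Amazon Redshift tables.
glue_context.write_dynamic_frame.from_options(
    frame=dyf,
    connection_type="s3",
    connection_options={"path": "s3://example-test-results-bucket/parquet/"},
    format="parquet",
)
```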

Question 0cMk3YyDdi2I9p7JQY7R

Question

A data engineer uses Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to run data pipelines in an AWS account.

A workflow recently failed to run. The data engineer needs to use Apache Airflow logs to diagnose the failure of the workflow.

Which log type should the data engineer use to diagnose the cause of the failure?

Choices

  • A: YourEnvironmentName-WebServer
  • B: YourEnvironmentName-Scheduler
  • C: YourEnvironmentName-DAGProcessing
  • D: YourEnvironmentName-Task
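Amazon MWAA publishes each Airflow log type to its own CloudWatch Logs log group, typically named airflow-&lt;EnvironmentName&gt;-&lt;LogType&gt;. As a minimal sketch (the environment name is a placeholder), the task logs for a failed workflow could be searched like this:

```python
import boto3

logs = boto3.client("logs")

# Placeholder log group name following the airflow-<EnvironmentName>-<LogType> convention.
LOG_GROUP = "airflow-YourEnvironmentName-Task"

response = logs.filter_log_events(
    logGroupName=LOG_GROUP,
    filterPattern="ERROR",
    limit=50,
)

for event in response["events"]:
    print(event["timestamp"], event["message"])
```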

Question UvwwSX81mqcpmbINLquT

Question

A finance company uses Amazon Redshift as a data warehouse. The company stores the data in a shared Amazon S3 bucket. The company uses Amazon Redshift Spectrum to access the data that is stored in the S3 bucket. The data comes from certified third-party data providers. Each third-party data provider has unique connection details.

To comply with regulations, the company must ensure that none of the data is accessible from outside the company’s AWS environment.

Which combination of steps should the company take to meet these requirements? (Choose two.)

Choices

  • A: Replace the existing Redshift cluster with a new Redshift cluster that is in a private subnet. Use an interface VPC endpoint to connect to the Redshift cluster. Use a NAT gateway to give Redshift access to the S3 bucket.
  • B: Create an AWS CloudHSM hardware security module (HSM) for each data provider. Encrypt each data provider’s data by using the corresponding HSM for each data provider.
  • C: Turn on enhanced VPC routing for the Amazon Redshift cluster. Set up an AWS Direct Connect connection and configure a connection between each data provider and the finance company’s VPC.
  • D: Define table constraints for the primary keys and the foreign keys.
  • E: Use federated queries to access the data from each data provider. Do not upload the data to the S3 bucket. Perform the federated queries through a gateway VPC endpoint.
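For reference, enhanced VPC routing (the mechanism named in option C) is enabled on an existing cluster with a single API call. The sketch below uses boto3; the cluster identifier is a placeholder.

```python
import boto3

redshift = boto3.client("redshift")

# Turn on enhanced VPC routing so COPY/UNLOAD and Redshift Spectrum
# traffic to Amazon S3 stays inside the VPC. Placeholder cluster name.
redshift.modify_cluster(
    ClusterIdentifier="finance-dwh-cluster",
    EnhancedVpcRouting=True,
)
```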

Question gZLQNKxvlJKrveq37BGa

Question

Files from multiple data sources arrive in an Amazon S3 bucket on a regular basis. A data engineer wants to ingest new files into Amazon Redshift in near real time when the new files arrive in the S3 bucket.

Which solution will meet these requirements?

Choices

  • A: Use the query editor v2 to schedule a COPY command to load new files into Amazon Redshift.
  • B: Use the zero-ETL integration between Amazon Aurora and Amazon Redshift to load new files into Amazon Redshift.
  • C: Use AWS Glue job bookmarks to extract, transform, and load (ETL) new files into Amazon Redshift.
  • D: Use S3 Event Notifications to invoke an AWS Lambda function that loads new files into Amazon Redshift.
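As a rough illustration of the pattern in option D, the Lambda function invoked by the S3 event notification can issue a COPY through the Redshift Data API for each new object. The cluster, database, user, target table, IAM role, and file format below are placeholder assumptions.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Placeholder identifiers for this sketch.
CLUSTER_ID = "my-redshift-cluster"
DATABASE = "dev"
DB_USER = "admin"
TARGET_TABLE = "staging.incoming_files"
COPY_ROLE_ARN = "arn:aws:iam::123456789012:role/RedshiftCopyRole"

def lambda_handler(event, context):
    """Run a COPY for each object reported by the S3 event notification."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        sql = (
            f"COPY {TARGET_TABLE} "
            f"FROM 's3://{bucket}/{key}' "
            f"IAM_ROLE '{COPY_ROLE_ARN}' "
            f"FORMAT AS JSON 'auto';"  # format assumed; match your files
        )
        redshift_data.execute_statement(
            ClusterIdentifier=CLUSTER_ID,
            Database=DATABASE,
            DbUser=DB_USER,
            Sql=sql,
        )
```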