Questions and Answers

Question pZCjbhNLVZwieOp04fol

Question

A company is migrating its database servers from Amazon EC2 instances that run Microsoft SQL Server to Amazon RDS for Microsoft SQL Server DB instances. The company’s analytics team must export large data elements every day until the migration is complete. The data elements are the result of SQL joins across multiple tables. The data must be in Apache Parquet format. The analytics team must store the data in Amazon S3. Which solution will meet these requirements in the MOST operationally efficient way?

Choices

  • A: Create a view in the EC2 instance-based SQL Server databases that contains the required data elements. Create an AWS Glue job that selects the data directly from the view and transfers the data in Parquet format to an S3 bucket. Schedule the AWS Glue job to run every day.
  • B: Schedule SQL Server Agent to run a daily SQL query that selects the desired data elements from the EC2 instance-based SQL Server databases. Configure the query to direct the output .csv objects to an S3 bucket. Create an S3 event that invokes an AWS Lambda function to transform the output format from .csv to Parquet.
  • C: Use a SQL query to create a view in the EC2 instance-based SQL Server databases that contains the required data elements. Create and run an AWS Glue crawler to read the view. Create an AWS Glue job that retrieves the data and transfers the data in Parquet format to an S3 bucket. Schedule the AWS Glue job to run every day.
  • D: Create an AWS Lambda function that queries the EC2 instance-based databases by using Java Database Connectivity (JDBC). Configure the Lambda function to retrieve the required data, transform the data into Parquet format, and transfer the data into an S3 bucket. Use Amazon EventBridge to schedule the Lambda function to run every day.
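As an illustration of the Glue-based approach described in options A and C, the sketch below shows a minimal Glue PySpark job that reads a cataloged SQL Server view and writes Parquet to S3. The database, table, and bucket names are placeholders, and the script assumes the view has already been cataloged through a Glue connection or crawler.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job bootstrap
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the SQL Server view through the Glue Data Catalog
# (placeholder database and table names)
source = glue_context.create_dynamic_frame.from_catalog(
    database="sqlserver_db",
    table_name="analytics_export_view",
)

# Write the joined data elements to S3 in Parquet format (placeholder bucket)
glue_context.write_dynamic_frame.from_options(
    frame=source,
    connection_type="s3",
    connection_options={"path": "s3://example-analytics-bucket/exports/"},
    format="parquet",
)

job.commit()
```

Scheduling the job daily can then be handled by a Glue trigger, with no intermediate .csv conversion step.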

Question 2URAied5iX7GSbkTIUW8

Question

A financial company wants to implement a data mesh. The data mesh must support centralized data governance, data analysis, and data access control. The company has decided to use AWS Glue for data catalogs and extract, transform, and load (ETL) operations. Which combination of AWS services will implement a data mesh? (Choose two.)

Choices

  • A: Use Amazon Aurora for data storage. Use an Amazon Redshift provisioned cluster for data analysis.
  • B: Use Amazon S3 for data storage. Use Amazon Athena for data analysis.
  • C: Use AWS Glue DataBrew for centralized data governance and access control.
  • D: Use Amazon RDS for data storage. Use Amazon EMR for data analysis.
  • E: Use AWS Lake Formation for centralized data governance and access control.
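As an illustration of how the S3/Athena/Lake Formation combination in options B and E fits together, the sketch below grants a consumer role access to a cataloged table through Lake Formation and then queries the S3-backed table with Athena. The account ID, role, database, table, and output bucket are placeholders.

```python
import boto3

lakeformation = boto3.client("lakeformation")
athena = boto3.client("athena")

# Centralized governance: grant a consumer role SELECT on a cataloged table
# (placeholder principal, database, and table names)
lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/analytics-consumer"
    },
    Resource={"Table": {"DatabaseName": "sales_domain", "Name": "orders"}},
    Permissions=["SELECT"],
)

# Data analysis: query the S3-backed table with Athena
athena.start_query_execution(
    QueryString="SELECT order_id, total FROM orders LIMIT 10",
    QueryExecutionContext={"Database": "sales_domain"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
```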

Question PO08YlrEEUelQ5bVBUlg

Question

A data engineering team is using an Amazon Redshift data warehouse for operational reporting. The team wants to prevent performance issues that might result from long-running queries. A data engineer must choose a system table in Amazon Redshift that records anomalies when the query optimizer identifies conditions that might indicate performance issues. Which system table view should the data engineer use to meet this requirement?

Choices

  • A: STL_USAGE_CONTROL
  • B: STL_ALERT_EVENT_LOG
  • C: STL_QUERY_METRICS
  • D: STL_PLAN_INFO
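As an illustration of option B, optimizer alerts can be inspected with a query against STL_ALERT_EVENT_LOG. The sketch below runs such a query through the Redshift Data API; the cluster, database, and user names are placeholders.

```python
import boto3

redshift_data = boto3.client("redshift-data")

# Inspect recent optimizer alerts and their suggested fixes
# (placeholder cluster identifier, database, and user)
response = redshift_data.execute_statement(
    ClusterIdentifier="reporting-cluster",
    Database="dev",
    DbUser="admin",
    Sql="""
        SELECT query, event, solution, event_time
        FROM stl_alert_event_log
        ORDER BY event_time DESC
        LIMIT 20;
    """,
)

# The statement runs asynchronously; retrieve rows later with get_statement_result
print(response["Id"])
```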

Question paRzd9EgGbiHsKTNAVkC

Question

A data engineer must ingest a source of structured data that is in .csv format into an Amazon S3 data lake. The .csv files contain 15 columns. Data analysts need to run Amazon Athena queries on one or two columns of the dataset. The data analysts rarely query the entire file. Which solution will meet these requirements MOST cost-effectively?

Choices

  • A: Use an AWS Glue PySpark job to ingest the source data into the data lake in .csv format.
  • B: Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to ingest the data into the data lake in JSON format.
  • C: Use an AWS Glue PySpark job to ingest the source data into the data lake in Apache Avro format.
  • D: Create an AWS Glue extract, transform, and load (ETL) job to read from the .csv structured data source. Configure the job to write the data into the data lake in Apache Parquet format.
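As an illustration of the Parquet-based approach in option D, the sketch below is a minimal Glue PySpark job that reads the .csv source from S3 and writes columnar Parquet so Athena scans only the one or two columns a query needs. The S3 paths are placeholders.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job bootstrap
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the .csv source, treating the first row as a header (placeholder path)
csv_frame = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-raw-bucket/incoming/"]},
    format="csv",
    format_options={"withHeader": True},
)

# Write Parquet into the data lake (placeholder path)
glue_context.write_dynamic_frame.from_options(
    frame=csv_frame,
    connection_type="s3",
    connection_options={"path": "s3://example-data-lake/curated/"},
    format="parquet",
)

job.commit()
```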

Question GqVcYZ9SIxMhYSMcyP1v

Question

A company has five offices in different AWS Regions. Each office has its own human resources (HR) department that uses a unique IAM role. The company stores employee records in a data lake that is based on Amazon S3 storage. A data engineering team needs to limit access to the records. Each HR department should be able to access records for only employees who are within the HR department’s Region. Which combination of steps should the data engineering team take to meet this requirement with the LEAST operational overhead? (Choose two.)

Choices

  • A: Use data filters for each Region to register the S3 paths as data locations.
  • B: Register the S3 path as an AWS Lake Formation location.
  • C: Modify the IAM roles of the HR departments to add a data filter for each department’s Region.
  • D: Enable fine-grained access control in AWS Lake Formation. Add a data filter for each Region.
  • E: Create a separate S3 bucket for each Region. Configure an IAM policy to allow S3 access. Restrict access based on Region.
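As an illustration of the Lake Formation steps described in options B and D, the sketch below registers the data lake's S3 path with Lake Formation and creates a row-level data filter that limits an HR department to its own Region. The bucket ARN, catalog ID, database, table, column, and filter names are placeholders.

```python
import boto3

lakeformation = boto3.client("lakeformation")

# Register the employee-records S3 path as a Lake Formation location
# (placeholder bucket and prefix)
lakeformation.register_resource(
    ResourceArn="arn:aws:s3:::example-hr-data-lake/employee-records/",
    UseServiceLinkedRole=True,
)

# Create a row-level data filter so one HR department sees only its own Region
# (placeholder catalog ID, database, table, and filter expression)
lakeformation.create_data_cells_filter(
    TableData={
        "TableCatalogId": "111122223333",
        "DatabaseName": "hr",
        "TableName": "employee_records",
        "Name": "eu_west_1_only",
        "RowFilter": {"FilterExpression": "region = 'eu-west-1'"},
        "ColumnWildcard": {},
    }
)
```

A matching filter would be created for each of the five Regions and granted to that Region's HR role with Lake Formation permissions, avoiding per-bucket IAM policy maintenance.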