Questions and Answers

Question nXpfVxJb9ucmtzFhRX3l

Question

When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?

Choices

  • A: Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: 1
  • B: Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: Unlimited
  • C: Cluster: Existing All-Purpose Cluster; Retries: Unlimited; Maximum Concurrent Runs: 1
  • D: Cluster: New Job Cluster; Retries: None; Maximum Concurrent Runs: 1
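
The settings named in these choices correspond to fields in a Databricks job definition. The sketch below shows one way they might be expressed, assuming the Jobs API 2.1 field names new_cluster, existing_cluster_id, max_retries (where -1 means retry indefinitely and 0 means never retry) and max_concurrent_runs. The helper build_job_settings, the job name, task key, notebook path, cluster ID, and cluster sizing are all hypothetical placeholders, not values from the question. Choice B is left out because the API expects a concrete number for concurrent runs.

```python
def build_job_settings(use_new_cluster, max_retries, max_concurrent_runs):
    """Return a Jobs API 2.1 style settings dict for a one-task streaming job.

    All names and IDs below are illustrative placeholders.
    """
    task = {
        "task_key": "stream_ingest",                                  # hypothetical
        "notebook_task": {"notebook_path": "/Jobs/stream_ingest"},    # hypothetical
        "max_retries": max_retries,  # -1 = retry indefinitely, 0 = never retry
    }
    if use_new_cluster:
        # A job cluster is created for the run and released afterwards.
        task["new_cluster"] = {
            "spark_version": "13.3.x-scala2.12",  # placeholder runtime
            "node_type_id": "i3.xlarge",          # placeholder node type
            "num_workers": 2,
        }
    else:
        # Reuse an all-purpose (interactive) cluster by ID.
        task["existing_cluster_id"] = "0000-000000-example"  # hypothetical
    return {
        "name": "streaming-prod-job",  # hypothetical
        "max_concurrent_runs": max_concurrent_runs,
        "tasks": [task],
    }


# The settings described in choices A, C, and D, in that order.
settings_a = build_job_settings(True, -1, 1)
settings_c = build_job_settings(False, -1, 1)
settings_d = build_job_settings(True, 0, 1)
```

Limiting max_concurrent_runs to 1 keeps a second copy of the same stream from starting while the first is still running, which is why that field matters alongside the retry policy.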

Question J518MSBKdYmq6kts6s7u

Question

A Delta Lake table was created with the following query:

//IMG//

Realizing that the original query had a typographical error, the following code was executed:

ALTER TABLE prod.sales_by_stor RENAME TO prod.sales_by_store

Which result will occur after running the second command?

Choices

  • A: The table reference in the metastore is updated.
  • B: All related files and metadata are dropped and recreated in a single ACID transaction.
  • C: The table name change is recorded in the Delta transaction log.
  • D: A new Delta transaction log is created for the renamed table.

Question bwvxveoqZoTVk1M38Neu

Question

The data engineering team has configured a Databricks SQL query and alert to monitor the values in a Delta Lake table. The recent_sensor_recordings table contains an identifying sensor_id alongside the timestamp and temperature for the most recent 5 minutes of recordings.

The following query is used to create the alert:

//IMG//

The query is set to refresh each minute and always completes in less than 10 seconds. The alert is set to trigger when mean(temperature) > 120. Notifications are set to be sent at most once every minute.

If this alert raises notifications for 3 consecutive minutes and then stops, which statement must be true?

Choices

  • A: The total average temperature across all sensors exceeded 120 on three consecutive executions of the query
  • B: The average temperature recordings for at least one sensor exceeded 120 on three consecutive executions of the query
  • C: The source query failed to update properly for three consecutive minutes and then restarted
  • D: The maximum temperature recording for at least one sensor exceeded 120 on three consecutive executions of the query
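
The choices differ mainly in which aggregate is compared against the threshold. Because the alert query itself is not reproduced here, the sketch below does not assume its shape; it only contrasts an overall mean, a per-sensor mean, and a per-sensor maximum on a small set of made-up readings. The sensor IDs and temperatures are hypothetical.

```python
from collections import defaultdict
from statistics import mean

THRESHOLD = 120

# Hypothetical (sensor_id, temperature) readings for one query execution.
readings = [
    ("s1", 130.0), ("s1", 128.0),
    ("s2", 70.0), ("s2", 72.0),
    ("s3", 125.0), ("s3", 90.0),
]

# Group temperatures by sensor.
by_sensor = defaultdict(list)
for sensor_id, temperature in readings:
    by_sensor[sensor_id].append(temperature)

# Three different aggregates that the choices talk about.
overall_mean = mean(t for _, t in readings)
per_sensor_mean = {s: mean(ts) for s, ts in by_sensor.items()}
per_sensor_max = {s: max(ts) for s, ts in by_sensor.items()}

overall_exceeds = overall_mean > THRESHOLD
sensors_mean_exceeds = sorted(s for s, v in per_sensor_mean.items() if v > THRESHOLD)
sensors_max_exceeds = sorted(s for s, v in per_sensor_max.items() if v > THRESHOLD)
```

With these readings the three conditions disagree: one sensor's mean is above the threshold, two sensors have a maximum above it, and the overall mean is below it. Which of them the alert actually evaluates depends on how the query groups its rows.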

Question qx4WW5rFC8h3DpzNFxaM

Question

The Databricks workspace administrator has configured interactive clusters for each of the data engineering groups. To control costs, clusters are set to terminate after 30 minutes of inactivity. Each user should be able to execute workloads against their assigned clusters at any time of the day.

Assuming users have been added to a workspace but not granted any permissions, which of the following describes the minimal permissions a user would need to start and attach to an already configured cluster?

Choices

  • A: “Can Manage” privileges on the required cluster
  • B: Cluster creation allowed, “Can Restart” privileges on the required cluster
  • C: Cluster creation allowed, “Can Attach To” privileges on the required cluster
  • D: “Can Restart” privileges on the required cluster
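
One way to reason about the choices is to list what each cluster permission level allows and check which levels cover the required actions. The sketch below is a simplified summary of the cluster permission levels as commonly documented for Databricks, not an exhaustive or authoritative table; the ability names and the helper levels_allowing are illustrative, and the current access control documentation should be treated as the source of truth.

```python
# Simplified, illustrative view of cluster permission levels.
# Each level is assumed to include the abilities of the level before it.
CLUSTER_PERMISSIONS = {
    "Can Attach To": {"attach"},
    "Can Restart": {"attach", "start", "restart", "terminate"},
    "Can Manage": {
        "attach", "start", "restart", "terminate",
        "edit", "manage_permissions",
    },
}


def levels_allowing(required):
    """Return the permission levels whose abilities cover every required action."""
    return [
        level
        for level, abilities in CLUSTER_PERMISSIONS.items()
        if set(required) <= abilities
    ]


# Actions the question asks about: starting a terminated cluster and attaching to it.
sufficient_levels = levels_allowing({"start", "attach"})
```

Under these assumptions, the first entry in sufficient_levels is the least privileged level that still covers both actions. Note that none of the abilities listed involves creating a new cluster.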

Question ejZvjeE9crg30V9dsovh

Question

The data science team has created and logged a production model using MLflow. The following code correctly imports and applies the production model to output the predictions as a new DataFrame named preds with the schema “customer_id LONG, predictions DOUBLE, date DATE”.

//IMG//

The data science team would like predictions saved to a Delta Lake table with the ability to compare all predictions across time. Churn predictions will be made at most once per day.

Which code block accomplishes this task while minimizing potential compute costs?

Choices

  • A: preds.write.mode("append").saveAsTable("churn_preds")
  • B: preds.write.format("delta").save("/preds/churn_preds")
  • C:
  • D: