Questions and Answers

Question BJulJpSW9796OyyrOz35

Question

A data engineering team has noticed that their Databricks SQL queries are running too slowly when they are submitted to a non-running SQL endpoint. The data engineering team wants this issue to be resolved.

Which of the following approaches can the team use to reduce the time it takes to return results in this scenario?

Choices

  • A: They can turn on the Serverless feature for the SQL endpoint and change the Spot Instance Policy to “Reliability Optimized.”
  • B: They can turn on the Auto Stop feature for the SQL endpoint.
  • C: They can increase the cluster size of the SQL endpoint.
  • D: They can turn on the Serverless feature for the SQL endpoint.
  • E: They can increase the maximum bound of the SQL endpoint’s scaling range.

Question O7IFmDadbgrXMXoKzTgF

Question

A data engineer has a Job that has a complex run schedule, and they want to transfer that schedule to other Jobs.

Rather than manually selecting each value in the scheduling form in Databricks, which of the following tools can the data engineer use to represent and submit the schedule programmatically?

Choices

  • A: pyspark.sql.types.DateType
  • B: datetime
  • C: pyspark.sql.types.TimestampType
  • D: Cron syntax
  • E: There is no way to represent and submit this information programmatically
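
For reference, a minimal sketch of how a schedule might be written once as a cron string and reused across Jobs. It assumes the `schedule` block of the Databricks Jobs API job settings (`quartz_cron_expression`, `timezone_id`, `pause_status`); the helper `build_schedule`, the job names, and the schedule itself are hypothetical, and nothing is sent to a workspace.

```python
import json


def build_schedule(quartz_cron_expression: str, timezone_id: str = "UTC") -> dict:
    """Return a schedule block in the shape used by Jobs API job settings."""
    return {
        "quartz_cron_expression": quartz_cron_expression,
        "timezone_id": timezone_id,
        "pause_status": "UNPAUSED",
    }


# Hypothetical schedule: 06:30 on weekdays.
# Quartz field order: seconds minutes hours day-of-month month day-of-week
shared = build_schedule("0 30 6 ? * MON-FRI")

# Hypothetical job names; each job gets its own copy of the same schedule.
job_settings = [
    {"name": name, "schedule": dict(shared)}
    for name in ("ingest_orders", "ingest_customers")
]

print(json.dumps(job_settings, indent=2))
```

Because the schedule is a plain string, it can be copied between Jobs or kept in version control rather than re-entered in the scheduling form.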

Question 5qyPnzCEZhkKbgaflWbN

Question

Which of the following approaches should be used to send the Databricks Job owner an email in the case that the Job fails?

Choices

  • A: Manually programming an alert system into each cell of the Notebook
  • B: Setting up an Alert in the Job page
  • C: Setting up an Alert in the Notebook
  • D: There is no way to notify the Job owner in the case of Job failure
  • E: MLflow Model Registry Webhooks
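
For reference, failure notifications configured on a Job correspond to an `email_notifications` block in the job settings. The sketch below assumes the Jobs API field `email_notifications.on_failure`; the helper `with_failure_email`, the job name, and the address are hypothetical, and the code only builds a settings dictionary.

```python
def with_failure_email(job_settings: dict, owner_email: str) -> dict:
    """Return a copy of job_settings that emails owner_email when a run fails."""
    updated = dict(job_settings)
    notifications = dict(updated.get("email_notifications", {}))
    recipients = list(notifications.get("on_failure", []))
    if owner_email not in recipients:
        recipients.append(owner_email)
    notifications["on_failure"] = recipients
    updated["email_notifications"] = notifications
    return updated


# Hypothetical job name and owner address.
settings = with_failure_email({"name": "nightly_ingest"}, "owner@example.com")
print(settings)
```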

Question 3CcvaOdjq4IVfTmGHpBr

Question

An engineering manager uses a Databricks SQL query to monitor ingestion latency for each data source. The manager checks the results of the query every day, but they are manually rerunning the query each day and waiting for the results.

Which of the following approaches can the manager use to ensure the results of the query are updated each day?

Choices

  • A: They can schedule the query to refresh every 1 day from the SQL endpoint’s page in Databricks SQL.
  • B: They can schedule the query to refresh every 12 hours from the SQL endpoint’s page in Databricks SQL.
  • C: They can schedule the query to refresh every 1 day from the query’s page in Databricks SQL.
  • D: They can schedule the query to run every 1 day from the Jobs UI.
  • E: They can schedule the query to run every 12 hours from the Jobs UI.

Question qGqbIVp1VpGEP3MbD8Ua

Question

In which of the following scenarios should a data engineer select a Task in the Depends On field of a new Databricks Job Task?

Choices

  • A: When another task needs to be replaced by the new task
  • B: When another task needs to fail before the new task begins
  • C: When another task has the same dependency libraries as the new task
  • D: When another task needs to use as little compute resources as possible
  • E: When another task needs to successfully complete before the new task begins
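
For reference, the Depends On field corresponds to a `depends_on` list of `task_key` references in the task settings of a multi-task Job. The sketch below is a simplified model of the default behavior only: the task names and the helper `runnable_tasks` are hypothetical, and it treats a task as runnable once every upstream task has succeeded.

```python
# Hypothetical three-task Job: ingest -> transform -> report
tasks = [
    {"task_key": "ingest", "depends_on": []},
    {"task_key": "transform", "depends_on": [{"task_key": "ingest"}]},
    {"task_key": "report", "depends_on": [{"task_key": "transform"}]},
]


def runnable_tasks(tasks: list, succeeded: set) -> list:
    """Return keys of tasks not yet run whose upstream tasks have all succeeded."""
    return [
        t["task_key"]
        for t in tasks
        if t["task_key"] not in succeeded
        and all(dep["task_key"] in succeeded for dep in t.get("depends_on", []))
    ]


print(runnable_tasks(tasks, set()))
print(runnable_tasks(tasks, {"ingest"}))
```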