Questions and Answers

Question EGDkICKOhbEAEYwJx5Gd

Question

Incorporating unit tests into a PySpark application requires upfront attention to the design of your jobs, or a potentially significant refactoring of existing code.

Which benefit offsets this additional effort?

Choices

  • A: Improves the quality of your data
  • B: Validates a complete use case of your application
  • C: Troubleshooting is easier since all steps are isolated and tested individually
  • D: Ensures that all steps interact correctly to achieve the desired end result
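
As a rough illustration of what unit testing individual steps can look like, the sketch below tests two small transformation functions in isolation. It uses plain Python dictionaries so that it runs without a Spark session. The functions `add_total` and `drop_invalid` and the field names are hypothetical; in a real PySpark job the equivalent steps would accept and return DataFrames.

```python
import unittest


def add_total(row: dict) -> dict:
    """Hypothetical step: derive a 'total' field from price and quantity."""
    return {**row, "total": row["price"] * row["quantity"]}


def drop_invalid(rows: list) -> list:
    """Hypothetical step: keep only rows with a positive quantity."""
    return [r for r in rows if r["quantity"] > 0]


class TransformTests(unittest.TestCase):
    # Each test exercises exactly one step, so a failure points at that step.
    def test_add_total(self):
        row = add_total({"price": 2.5, "quantity": 4})
        self.assertEqual(row["total"], 10.0)

    def test_drop_invalid(self):
        rows = [{"quantity": 1}, {"quantity": 0}, {"quantity": -3}]
        self.assertEqual(drop_invalid(rows), [{"quantity": 1}])
```

Saved in a module, these tests can be run with `python -m unittest`. Structuring a job as small functions like these is the design effort the question refers to.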

Question BVVA949iMldGOz9Fi96I

Question

What describes integration testing?

Choices

  • A: It validates an application use case.
  • B: It validates the behavior of individual elements of an application.
  • C: It requires an automated testing framework.
  • D: It validates interactions between subsystems of your application.

Question tTFbNVzpPoerY5nqMHkt

Question

The Databricks CLI is used to trigger a run of an existing job by passing the job_id parameter. The response confirming that the run request was submitted successfully includes a run_id field.

Which statement describes what the number alongside this field represents?

Choices

  • A: The job_id and number of times the job has been run are concatenated and returned.
  • B: The globally unique ID of the newly triggered run.
  • C: The number of times the job definition has been run in this workspace.
  • D: The job_id is returned in this field.
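
For orientation, here is a minimal sketch of reading the run_id field from such a response, assuming the response is a JSON object as the question describes. The sample payload and its number are made up for illustration, and `extract_run_id` is a hypothetical helper, not part of the Databricks CLI.

```python
import json

# Hypothetical response body; the value is invented for this example.
sample_response = '{"run_id": 455644833}'


def extract_run_id(response_text: str) -> int:
    """Parse the JSON response and return the value of its run_id field."""
    payload = json.loads(response_text)
    return int(payload["run_id"])


run_id = extract_run_id(sample_response)
print(run_id)
```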

Question m0v8KDZYlM6jiSgNgAA8

Question

A Databricks job has been configured with three tasks, each of which is a Databricks notebook. Task A does not depend on other tasks. Tasks B and C run in parallel, with each having a serial dependency on task A.

What will be the resulting state if tasks A and B complete successfully but task C fails during a scheduled run?

Choices

  • A: All logic expressed in the notebooks associated with tasks A and B will have been successfully completed; some operations in task C may have completed successfully.
  • B: Unless all tasks complete successfully, no changes will be committed to the Lakehouse; because task C failed, all commits will be rolled back automatically.
  • C: Because all tasks are managed as a dependency graph, no changes will be committed to the Lakehouse until all tasks have successfully been completed.
  • D: All logic expressed in the notebooks associated with tasks A and B will have been successfully completed; any changes made in task C will be rolled back due to task failure.
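
The scenario can be modeled with a toy task runner. This is only a sketch under stated assumptions, not the Databricks scheduler: tasks run sequentially here rather than in parallel, the `committed` list stands in for writes to tables, and every name is hypothetical. It assumes, as is the case for Databricks jobs, that no transaction spans multiple tasks, so each write stands on its own once it is made.

```python
from typing import Callable, Dict, List

committed: List[str] = []  # stand-in for writes that have reached storage


def task_a() -> None:
    committed.append("A: write 1")


def task_b() -> None:
    committed.append("B: write 1")


def task_c() -> None:
    committed.append("C: write 1")  # this write happens before the failure
    raise RuntimeError("task C failed after its first write")


def run_job(tasks: Dict[str, Callable[[], None]],
            deps: Dict[str, List[str]]) -> Dict[str, str]:
    """Run tasks in insertion order (assumed to be a valid dependency order)."""
    status: Dict[str, str] = {}
    for name, func in tasks.items():
        # Skip a task if any upstream task did not succeed.
        if any(status.get(d) != "SUCCESS" for d in deps.get(name, [])):
            status[name] = "SKIPPED"
            continue
        try:
            func()
            status[name] = "SUCCESS"
        except Exception:
            status[name] = "FAILED"  # nothing already written is undone
    return status


status = run_job(
    {"A": task_a, "B": task_b, "C": task_c},
    {"B": ["A"], "C": ["A"]},
)
print(status)
print(committed)
```

In this model the runner records the failure of task C but has no mechanism for undoing writes that any task has already made.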

Question 21EnX3xmZksEeqkBmYJt

Question

Which statement regarding stream-static joins and static Delta tables is correct?

Choices

  • A: Each microbatch of a stream-static join will use the most recent version of the static Delta table as of each microbatch.
  • B: Each microbatch of a stream-static join will use the most recent version of the static Delta table as of the job’s initialization.
  • C: The checkpoint directory will be used to track state information for the unique keys present in the join.
  • D: Stream-static joins cannot use static Delta tables because of consistency issues.
  • E: The checkpoint directory will be used to track updates to the static Delta table.
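
To make the mechanics concrete, here is a toy model in plain Python, with no Spark involved, of a stateless stream-static join in which the static side is looked up again for every microbatch. It is a sketch under that assumption, which matches how Databricks documents stream-static joins against Delta tables. The names `static_versions`, `latest_snapshot`, and `process_microbatch` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Each entry is one committed version of the static table (key -> value).
static_versions: List[Dict[str, str]] = [{"k1": "v0"}]


def latest_snapshot() -> Dict[str, str]:
    """Return the most recent committed version of the static table."""
    return static_versions[-1]


def process_microbatch(keys: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Join one microbatch of keys against the snapshot resolved right now."""
    snapshot = latest_snapshot()  # resolved per batch, not once at startup
    return [(key, snapshot.get(key)) for key in keys]


batch1 = process_microbatch(["k1", "k2"])

# The static table receives a new version between the two microbatches.
static_versions.append({"k1": "v0", "k2": "v1"})

batch2 = process_microbatch(["k1", "k2"])
print(batch1)
print(batch2)
```

Because the join keeps no per-key state, the second batch picks up the row for `k2` only because the snapshot is resolved again when that batch runs.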