Questions and Answers
Question k8eQttkIK5BDO9afSVeO
Question
A data analysis team has noticed that their Databricks SQL queries are running too slowly when connected to their always-on SQL endpoint. They claim that this issue is present when many members of the team are running small queries simultaneously. They ask the data engineering team for help. The data engineering team notices that each of the team’s queries uses the same SQL endpoint.
Which approach can the data engineering team use to improve the latency of the team’s queries?
Choices
- A: They can increase the cluster size of the SQL endpoint.
- B: They can increase the maximum bound of the SQL endpoint’s scaling range.
- C: They can turn on the Auto Stop feature for the SQL endpoint.
- D: They can turn on the Serverless feature for the SQL endpoint.
Answer: B
Community answer: B (83%), D (17%)
Discussion
Comment 1295497 by Jugiboss
- Upvotes: 3
Selected Answer: B
It’s always on, so there is no need for serverless to speed up start-up.
Comment 1273027 by 7082935
- Upvotes: 2
Selected Answer: B
The question states that the developers are connected to their “always-on” SQL endpoint. This means there is no startup delay. We can increase performance of many simultaneous queries by scaling out.
Comment 1272963 by 9d4d68a
- Upvotes: 1
Further, on the other options:
A. Increase the cluster size of the SQL endpoint: While increasing the cluster size might help if the current cluster size is insufficient, it does not necessarily address the scaling needs dynamically. The SQL endpoint might still be limited by its scaling configuration.
C. Turn on the Auto Stop feature: Auto Stop helps manage costs by automatically stopping the SQL endpoint when it is idle. However, it doesn’t address performance issues related to simultaneous query execution and would not improve query latency directly.
D. Turn on the Serverless feature: The Serverless SQL endpoint is designed for ad-hoc querying without requiring dedicated clusters. While it could help in certain scenarios, it may not be directly applicable if the issue is specifically related to high concurrency and resource contention in an always-on environment.
By increasing the scaling range, the SQL endpoint can handle more concurrent queries and improve overall performance.
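As a rough illustration of option B, the change amounts to raising the upper bound of the endpoint's cluster count while leaving the cluster size alone. The sketch below only builds a settings dictionary and makes no API call. The field names (`cluster_size`, `min_num_clusters`, `max_num_clusters`) follow the Databricks SQL Warehouses REST API as I understand it, and the endpoint name and values are hypothetical.

```python
def widen_scaling_range(settings: dict, new_max: int) -> dict:
    """Return a copy of the endpoint settings with a higher scaling maximum."""
    if new_max < settings["min_num_clusters"]:
        raise ValueError("max_num_clusters must be >= min_num_clusters")
    updated = dict(settings)  # shallow copy; the original is left untouched
    updated["max_num_clusters"] = new_max
    return updated


# Hypothetical current settings for the team's shared endpoint.
current = {
    "name": "analyst-endpoint",
    "cluster_size": "Small",
    "min_num_clusters": 1,
    "max_num_clusters": 1,
}

proposed = widen_scaling_range(current, new_max=4)
print(proposed)
```

Option A would instead change `cluster_size`, which targets slow individual queries rather than queueing when many small queries arrive at once.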
Comment 1265495 by 80370eb
- Upvotes: 1
Selected Answer: D
Turning on the Serverless feature allows the SQL endpoint to scale automatically and efficiently handle a large number of small queries, improving performance and reducing latency.
Question 3dJzaPIlAAgF2Tm7R3e9
Question
A data engineer wants to schedule their Databricks SQL dashboard to refresh once per day, but they only want the associated SQL endpoint to be running when it is necessary.
Which approach can the data engineer use to minimize the total running time of the SQL endpoint used in the refresh schedule of their dashboard?
Choices
- A: They can ensure the dashboard’s SQL endpoint matches each of the queries’ SQL endpoints.
- B: They can set up the dashboard’s SQL endpoint to be serverless.
- C: They can turn on the Auto Stop feature for the SQL endpoint.
- D: They can ensure the dashboard’s SQL endpoint is not one of the included query’s SQL endpoint.
Answer: C
Community answer: C (100%)
Discussion
Comment 1327326 by MultiCloudIronMan
- Upvotes: 1
Selected Answer: C
The correct answer is C: they can turn on the Auto Stop feature for the SQL endpoint. This feature ensures that the SQL endpoint automatically stops when it is not in use, minimizing the total running time and reducing costs.
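To put a number on the saving, here is a back-of-the-envelope sketch. It assumes a single daily refresh, ignores start-up time, and treats an idle timeout of 0 as "never stop". The parameter name `auto_stop_mins` mirrors the SQL Warehouses API field as I understand it, and the 5-minute refresh and 10-minute timeout are hypothetical values.

```python
MINUTES_PER_DAY = 24 * 60


def daily_running_minutes(refresh_minutes: int, auto_stop_mins: int) -> int:
    """Estimate minutes the endpoint runs per day for one scheduled refresh.

    auto_stop_mins == 0 is treated as "never stop" (an assumption of this model).
    """
    if auto_stop_mins == 0:
        return MINUTES_PER_DAY
    # The endpoint runs for the refresh, then idles until the timeout expires.
    return min(refresh_minutes + auto_stop_mins, MINUTES_PER_DAY)


always_on = daily_running_minutes(refresh_minutes=5, auto_stop_mins=0)
with_auto_stop = daily_running_minutes(refresh_minutes=5, auto_stop_mins=10)
print(always_on, with_auto_stop)  # 1440 15
```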
Question bmTM97yk8WstpK8Ezu0P
Question
An engineering manager wants to monitor the performance of a recent project using a Databricks SQL query. For the first week following the project’s release, the manager wants the query results to be updated every minute. However, the manager is concerned that the compute resources used for the query will be left running and cost the organization a lot of money beyond the first week of the project’s release.
Which approach can the engineering team use to ensure the query does not cost the organization any money beyond the first week of the project’s release?
Choices
- A: They can set a limit to the number of DBUs that are consumed by the SQL Endpoint.
- B: They can set the query’s refresh schedule to end after a certain number of refreshes.
- C: They can set the query’s refresh schedule to end on a certain date in the query scheduler.
- D: They can set a limit to the number of individuals that are able to manage the query’s refresh schedule.
Answer: C
Community answer: C (100%)
Discussion
Comment 1320576 by Worldmaster
- Upvotes: 1
Selected Answer: C
They can set the query’s refresh schedule to end on a certain date in the query scheduler. This is the best solution. By setting the refresh schedule to stop automatically on a specific date (e.g., one week after the project release), the engineering team ensures that the query will only refresh during the desired period, preventing unnecessary costs after that date. This automated stop avoids the need for manual intervention after the project’s first week.
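The effect of an end date can be modelled in a few lines. This is only a simulation of the idea, not the Databricks scheduler itself; the release date and the helper `refresh_times` are hypothetical.

```python
from datetime import datetime, timedelta


def refresh_times(start, end, every):
    """Yield each scheduled refresh time from start up to, but excluding, end."""
    current = start
    while current < end:
        yield current
        current += every


release = datetime(2024, 1, 1, 0, 0)  # hypothetical release date
schedule_end = release + timedelta(days=7)

count = sum(1 for _ in refresh_times(release, schedule_end, timedelta(minutes=1)))
print(count)  # 10080
```

A one-minute schedule bounded by the end date produces 7 × 24 × 60 = 10080 refreshes and then nothing further, which is why option C caps the cost without anyone having to remember to turn the schedule off.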
Question Ar5UZG5yeTJumc5CuSTR
Question
A new data engineering team has been assigned to an ELT project. The new data engineering team will need full privileges on the table sales to fully manage the project.
Which command can be used to grant full permissions on the table to the new data engineering team?
Choices
- A: GRANT ALL PRIVILEGES ON TABLE sales TO team;
- B: GRANT SELECT CREATE MODIFY ON TABLE sales TO team;
- C: GRANT SELECT ON TABLE sales TO team;
- D: GRANT ALL PRIVILEGES ON TABLE team TO sales;
Answer: A
Community answer: A (100%)
Discussion
Comment 1282748 by CommanderBigMac
- Upvotes: 1
Selected Answer: A
A is correct. Grant all on table sales to team, not table team to sales.
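The difference between options A and D is purely argument order: the securable comes after `ON TABLE` and the grantee after `TO`. The small helper below, `grant_statement`, is hypothetical and only assembles the statement text to make that order explicit. It does no identifier quoting or escaping and does not run anything against a workspace.

```python
def grant_statement(privileges: str, table: str, principal: str) -> str:
    """Build a GRANT statement: privileges first, then the table, then the grantee."""
    return f"GRANT {privileges} ON TABLE {table} TO {principal};"


# Option A: the table is `sales`, the grantee is `team`.
correct = grant_statement("ALL PRIVILEGES", table="sales", principal="team")

# Option D: same keywords, but table and grantee are swapped.
swapped = grant_statement("ALL PRIVILEGES", table="team", principal="sales")

print(correct)
print(swapped)
```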
Question ND681lHcOs0wu5R1W62o
Question
Differentiate between all-purpose clusters and jobs clusters.
A data engineering team has created a Python notebook to load data from cloud storage. This job has been tested and now needs to be scheduled in production.
Which would be the best cluster to be used in this case?
Choices
- A: All purpose cluster
- B: Any Unity Catalog-enabled cluster
- C: Jobs Cluster
- D: Serverless SQL warehouse
Answer: C
Community answer: C (100%)
Discussion
Comment 1327334 by MultiCloudIronMan
- Upvotes: 1
Selected Answer: C
Jobs clusters are designed specifically for running production jobs. They are ephemeral, meaning they are created when a job starts and terminated when the job completes. This makes them more cost-effective for scheduled jobs, as they do not incur costs when not in use.
Comment 1320577 by Worldmaster
- Upvotes: 1
Selected Answer: C
Jobs clusters are specifically designed to run scheduled jobs in a production environment. They are ephemeral, meaning they are created when a job starts and terminated once the job finishes. This makes them cost-efficient and optimized for batch processing.
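In a job definition, the jobs-cluster choice shows up as a `new_cluster` block on the task, whereas an all-purpose cluster would be referenced through `existing_cluster_id`. The sketch below builds such a definition as a plain dictionary. The field names follow the Databricks Jobs API as I understand it; the job name, notebook path, and schedule are hypothetical, and the runtime version and node type are left as placeholders.

```python
import json

job_settings = {
    "name": "load-from-cloud-storage",  # hypothetical job name
    "schedule": {
        "quartz_cron_expression": "0 0 2 * * ?",  # every day at 02:00
        "timezone_id": "UTC",
    },
    "tasks": [
        {
            "task_key": "load",
            "notebook_task": {"notebook_path": "/Repos/etl/load_data"},  # hypothetical path
            # Jobs cluster: created for this run and terminated when it finishes.
            "new_cluster": {
                "spark_version": "<runtime-version>",  # placeholder
                "node_type_id": "<node-type>",  # placeholder
                "num_workers": 2,
            },
        }
    ],
}

print(json.dumps(job_settings, indent=2))
```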