Questions and Answers
Question WGaFObuXnaOgyZiQDkyF
Question
A data engineer needs to access a view created by the sales team, using a shared cluster. The data engineer has already been granted USE permissions on the catalog and schema.
What are the minimum additional permissions the data engineer requires to access the view?
Choices
- A: Needs SELECT permission on the VIEW and the underlying TABLE.
- B: Needs SELECT permission only on the VIEW
- C: Needs ALL PRIVILEGES on the VIEW
- D: Needs ALL PRIVILEGES at the SCHEMA level
answer?
Answer: B Answer_ET: B Community answer B (67%) A (33%) Discussion
Comment 1328541 by san089
- Upvotes: 1
Selected Answer: B To read a view, the permissions required depend on the compute type, Databricks Runtime version, and access mode:
For all compute resources, you must have SELECT on the view itself, USE CATALOG on its parent catalog, and USE SCHEMA on its parent schema. This applies to all compute types that support Unity Catalog, including SQL warehouses, clusters in shared access mode, and clusters in single user access mode on Databricks Runtime 15.4 and above.
For clusters on Databricks Runtime 15.3 and below that use single user access mode, you must also have SELECT on all tables and views that are referenced by the view, in addition to USE CATALOG on their parent catalogs and USE SCHEMA on their parent schemas.
Comment 1326480 by Rinscy
- Upvotes: 1
Selected Answer: B B, and the key here is “Shared Cluster”. On a single-user cluster with a runtime prior to 15.4, permissions are needed on both the view and the underlying tables. With a shared access mode cluster, only on the view.
Comment 1318611 by Pirate_boid
- Upvotes: 2
Selected Answer: B For a view, one does not need permissions on the underlying table.
Comment 1315801 by Medkalys
- Upvotes: 2
Selected Answer: A In Databricks Unity Catalog, permissions are hierarchical and must cover all data objects involved. If a user needs to query a view, the following conditions apply:
SELECT permission on the VIEW: Allows the user to query the view itself. SELECT permission on the underlying TABLE(s): Views depend on the underlying tables or data sources. The user must also have SELECT permissions on these tables to access the data exposed by the view.
Comment 1313035 by SajadAhm
- Upvotes: 3
B is correct. In the Databricks partner training platform, the privileges page for a view shows that no one has access to the underlying table, yet access can still be granted to the view without granting access to that table. This is one of the main advantages of views.
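The minimum grants described in the discussion above can be sketched in Databricks SQL. The catalog, schema, view, and principal names below are hypothetical placeholders:

```sql
-- Sketch of the minimum Unity Catalog grants needed to read a view
-- on a shared-access-mode cluster (hypothetical object names).
GRANT USE CATALOG ON CATALOG sales_catalog TO `data_engineer`;
GRANT USE SCHEMA  ON SCHEMA  sales_catalog.sales_schema TO `data_engineer`;
GRANT SELECT      ON VIEW    sales_catalog.sales_schema.sales_view TO `data_engineer`;
-- Note: no SELECT on the view's underlying table is required in
-- shared access mode -- that is the point of answer B.
```

Since the question states USE permissions on the catalog and schema are already granted, the only additional privilege needed is the `SELECT` on the view itself.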
Question tyYSr1qhAOFTqEmvLaab
Question
Which method should a Data Engineer apply to ensure Workflows are being triggered on schedule?
Choices
- A: Scheduled Workflows require an always-running cluster, which is more expensive but reduces processing latency.
- B: Scheduled Workflows process data as it arrives at configured sources.
- C: Scheduled Workflows can reduce resource consumption and expense since the cluster runs only long enough to execute the pipeline.
- D: Scheduled Workflows run continuously until manually stopped.
answer?
Answer: C Answer_ET: C Community answer C (100%) Discussion
Comment 1327380 by MultiCloudIronMan
- Upvotes: 2
Selected Answer: C The correct answer is C. Scheduled Workflows can reduce resource consumption and expense since the cluster runs only long enough to execute the pipeline. This method ensures that the cluster is only active for the duration of the workflow execution, minimizing resource usage and costs.
Question VQpX3T8qMmmaCb2DwIzv
Question
The Delta transaction log for the ‘students’ table is shown using the ‘DESCRIBE HISTORY students’ command. A Data Engineer needs to query the table as it existed before the UPDATE operation listed in the log.
Which command should the Data Engineer use to achieve this? (Choose two.)
//IMG//
Choices
- A: SELECT * FROM students@v4
- B: SELECT * FROM students TIMESTAMP AS OF ‘2024-04-22T14:32:47.000+00:00’
- C: SELECT * FROM students FROM HISTORY VERSION AS OF 3
- D: SELECT * FROM students VERSION AS OF 5
- E: SELECT * FROM students TIMESTAMP AS OF ‘2024-04-22T14:32:58.000+00:00’
answer?
Answer: AB Answer_ET: AB Community answer AB (83%) BD (17%) Discussion
Comment 1334036 by CaoMengde09
- Upvotes: 2
Selected Answer: AB SELECT * FROM people10m VERSION AS OF 123; is identical to SELECT * FROM people10m@v123;
so SELECT * FROM students@v4 is the same as running SELECT * FROM students VERSION AS OF 4.
Comment 1329999 by DipeshGandhi131
- Upvotes: 2
Selected Answer: AB https://docs.databricks.com/en/delta/history.html#delta-time-travel-syntax
Comment 1327384 by MultiCloudIronMan
- Upvotes: 1
Selected Answer: BD Option A (SELECT * FROM students@v4) is not correct because the syntax students@v4 is not valid in SQL for querying a specific version of a Delta table. The correct syntax to query a specific version of a Delta table is to use the VERSION AS OF or TIMESTAMP AS OF clauses.
Therefore, the correct options are:
B. SELECT * FROM students TIMESTAMP AS OF ‘2024-04-22T14:32:47.000+00:00’
D. SELECT * FROM students VERSION AS OF 5
Comment 1320589 by Worldmaster
- Upvotes: 1
Selected Answer: AB AB correct https://docs.databricks.com/en/delta/history.html
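The equivalences discussed above can be sketched with standard Delta Lake time-travel syntax (the table name comes from the question; the exact version number and timestamp depend on the history shown in the image):

```sql
-- Delta time travel: these two forms are equivalent ways to read
-- a specific table version (version 4 assumed here for illustration).
SELECT * FROM students VERSION AS OF 4;
SELECT * FROM students@v4;

-- Timestamp-based time travel: resolves to the latest version
-- committed at or before the given timestamp.
SELECT * FROM students TIMESTAMP AS OF '2024-04-22T14:32:47.000+00:00';
```

Both the `@vN` shorthand and `VERSION AS OF` are documented Delta syntax, which is why option A is valid despite the BD comment's objection.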
Question Me4u56mwCkH5izMPyEgE
Question
An engineering manager uses a Databricks SQL query to monitor ingestion latency for each data source. The manager checks the results of the query every day, but they are manually rerunning the query each day and waiting for the results.
Which of the following approaches can the manager use to ensure the results of the query are updated each day?
Choices
- A: They can schedule the query to refresh every 1 day from the SQL endpoint’s page in Databricks SQL.
- B: They can schedule the query to refresh every 12 hours from the SQL endpoint’s page in Databricks SQL.
- C: They can schedule the query to refresh every 1 day from the query’s page in Databricks SQL.
- D: They can schedule the query to run every 12 hours from the Jobs UI.
answer?
Answer: C Answer_ET: C Community answer C (100%) Discussion
Comment 1327386 by MultiCloudIronMan
- Upvotes: 2
Selected Answer: C The correct answer is C. They can schedule the query to refresh every 1 day from the query’s page in Databricks SQL. This approach ensures that the query results are automatically updated each day without the need for manual intervention.
Question ILDQ79FBUW7OY6aWyErm
Question
Which of the following benefits is provided by the array functions from Spark SQL?
Choices
- A: An ability to work with data in a variety of types at once
- B: An ability to work with data within certain partitions and windows
- C: An ability to work with time-related data in specified intervals
- D: An ability to work with complex, nested data ingested from JSON files
- E: An ability to work with an array of tables for procedural automation
answer?
Answer: D Answer_ET: D Community answer D (96%) 4% Discussion
Comment 945996 by Atnafu
- Upvotes: 11
Array functions in Spark SQL allow you to work with complex, nested data ingested from JSON files. These functions can be used to extract data from nested structures, manipulate data within nested structures, and aggregate data within nested structures.
The other options are not benefits provided by the array functions from Spark SQL.
Option A: Array functions do not allow you to work with data in a variety of types at once.
Option B: Array functions do not allow you to work with data within certain partitions and windows.
Option C: Array functions do not allow you to work with time-related data in specified intervals.
Option E: Array functions do not allow you to work with an array of tables for procedural automation.
Therefore, the only benefit provided by the array functions from Spark SQL is the ability to work with complex, nested data ingested from JSON files.
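The use case described above can be sketched with a few common Spark SQL array functions. The table and column names here are hypothetical, assuming JSON records with an `items` array column:

```sql
-- Typical Spark SQL array functions on nested, JSON-style data
-- (hypothetical 'orders' table with an array column 'items').
SELECT
  order_id,
  size(items)                    AS item_count,  -- number of array elements
  array_contains(items, 'book')  AS has_book     -- membership test
FROM orders;

-- explode() flattens an array column into one row per element,
-- avoiding the need to restructure the nested data upfront.
SELECT order_id, explode(items) AS item
FROM orders;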
Comment 1335020 by danishanis
- Upvotes: 1
Selected Answer: D D is correct because array functions in Spark SQL are particularly useful for working with complex, nested data structures, such as those commonly found in JSON files. These functions allow you to manipulate and query arrays and nested data within your DataFrame, making it easier to work with hierarchical data.
Option A is not specific to array functions; Spark SQL in general provides the ability to work with a variety of data types.
Option B relates to window functions and partitioning in Spark SQL, not specifically to array functions. Window functions allow you to perform operations across a set of table rows that are somehow related to the current row.
Option C relates to time functions and interval operations in Spark SQL, not specifically to array functions.
Option E is not something array functions provide, as Spark SQL does not offer direct support for working with an array of tables for procedural automation through array functions.
Comment 1314109 by 806e7d2
- Upvotes: 2
Selected Answer: D Spark SQL array functions are particularly useful for working with complex and nested data structures, such as arrays, which are often found in semi-structured data formats like JSON. These functions allow users to manipulate and process array data directly, making it easier to handle nested structures without needing to flatten them upfront.
Comment 1262401 by 80370eb
- Upvotes: 2
Selected Answer: D D. An ability to work with complex, nested data ingested from JSON files
Array functions in Spark SQL allow you to work with complex and nested data structures, such as those found in JSON files, enabling operations on arrays and nested elements.
Comment 1249379 by ranjan24
- Upvotes: 1
D is the correct one
Comment 1246043 by ranjan24
- Upvotes: 1
The correct Answer is D
Comment 1244540 by 3fbc31b
- Upvotes: 1
Selected Answer: D The correct answer is D.
Comment 1213172 by BharaniRaj
- Upvotes: 1
Selected Answer: D D is the right answer
Comment 1189114 by benni_ale
- Upvotes: 1
Selected Answer: D I thought SQL arrays are usually seen when reading JSON files.
Comment 1177196 by Itmma
- Upvotes: 1
Selected Answer: E E is correct
Comment 1113195 by SerGrey
- Upvotes: 1
Correct answer is D
Comment 1109094 by Garyn
- Upvotes: 4
Selected Answer: D D. An ability to work with complex, nested data ingested from JSON files
Array functions in Spark SQL enable users to work efficiently with arrays and complex, nested data structures that are often ingested from JSON files or other nested data formats. These functions allow manipulation, querying, and extraction of elements from arrays and nested structures within the dataset, facilitating operations on complex data types within Spark SQL.
Comment 1071089 by Huroye
- Upvotes: 1
Correct answer is D. Arrays provide complex nesting of data and are easy to query. That’s why we use arrays for defining data domains.
Comment 1064800 by awofalus
- Upvotes: 1
Selected Answer: D D is correct
Comment 1028779 by VijayKula
- Upvotes: 1
Selected Answer: D D is the correct answer
Comment 1020506 by chris_mach
- Upvotes: 1
Selected Answer: D array functions allow you to work with JSON data
Comment 1017353 by KalavathiP
- Upvotes: 1
Selected Answer: D D is right ans
Comment 997919 by vctrhugo
- Upvotes: 3
Selected Answer: D D. An ability to work with complex, nested data ingested from JSON files
Array functions in Spark SQL are primarily used for working with arrays and complex, nested data structures, such as those often encountered when ingesting JSON files. These functions allow you to manipulate and query nested arrays and structures within your data, making it easier to extract and work with specific elements or values within complex data formats.
While some of the other options (such as option A for working with different data types) are features of Spark SQL or SQL in general, array functions specifically excel at handling complex, nested data structures like those found in JSON files.
Comment 897238 by prasioso
- Upvotes: 2
Selected Answer: D Correct answer is D. Spark SQL Array functions allow us to work with nested datasets in JSON files
Comment 876212 by Varma_Saraswathula
- Upvotes: 1
Option D
Comment 875870 by naxacod574
- Upvotes: 1
Option D
Comment 860636 by sdas1
- Upvotes: 1
option D
Comment 859669 by surrabhi_4
- Upvotes: 1
Selected Answer: D option D
Comment 858876 by knivesz
- Upvotes: 1
Selected Answer: D Correct answer is D
Comment 857997 by XiltroX
- Upvotes: 2
Selected Answer: D Correct answer is D. Arrays are nested datasets in JSON files
Comment 857308 by sguzel
- Upvotes: 1
it should be D