Questions and Answers
Question K8DWLXBn7vvwMZSuEz8N
Question
A data architect has determined that a table of the following format is necessary:
//IMG//
Which of the following code blocks uses SQL DDL commands to create an empty Delta table in the above format regardless of whether a table already exists with this name?
Choices
- A:
- B:
- C:
- D:
- E:
Answer: E Answer_ET: E Community answer E (86%) 14% Discussion
Comment 1050141 by meow_akk
- Upvotes: 6
E is correct. You don't need to specify Delta, as it's the default storage format for tables.
Comment 1132510 by Stemix
- Upvotes: 5
Both A and E have correct syntax, but the question says “regardless of whether a table already exists with this name”. Hence the correct answer is E.
Comment 1301418 by RandomForest
- Upvotes: 1
Selected Answer: E E is correct because you don't need to specify Delta (it's the default storage format for tables) and the question states “regardless of whether the table already exists”. As a result, A is wrong, as it will NOT recreate the table if it already exists.
Comment 1273390 by 9d4d68a
- Upvotes: 1
The closest correct option would be C, but it shouldn't have the WITH COLUMN clause. To create an empty Delta table with the specified schema in SQL, you can use the CREATE TABLE statement with the USING DELTA clause. The IF NOT EXISTS option ensures that the table is created only if it does not already exist. Here's the SQL code block that accomplishes this:
CREATE TABLE IF NOT EXISTS your_table_name (
  employeeId STRING,
  startDate DATE,
  avgRating DOUBLE
) USING DELTA;
Comment 1263475 by 80370eb
- Upvotes: 1
Selected Answer: A A is the correct answer. CREATE TABLE IF NOT EXISTS checks whether the table already exists in the database; if not, it creates the new table.
Comment 1203829 by benni_ale
- Upvotes: 1
Selected Answer: E E is correct
Comment 1127384 by azure_bimonster
- Upvotes: 2
Selected Answer: E E is the correct option
Comment 1117397 by bartfto
- Upvotes: 2
Selected Answer: E E. correct
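The distinction the commenters draw can be sketched as follows. This is an illustrative sketch: the table name is hypothetical, and the column names (employeeId, startDate, avgRating) are taken from the schema quoted in the discussion, not from the missing screenshot itself.

```sql
-- Option A style: creates the table only if it does not already exist.
-- An existing table with this name is left untouched.
CREATE TABLE IF NOT EXISTS table_name (
  employeeId STRING,
  startDate  DATE,
  avgRating  DOUBLE
);

-- Option E style: creates the table "regardless of whether a table
-- already exists with this name" -- an existing table is replaced.
CREATE OR REPLACE TABLE table_name (
  employeeId STRING,
  startDate  DATE,
  avgRating  DOUBLE
);
```

In both cases USING DELTA may be omitted, since Delta is the default table format in Databricks.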
Question K4c2qewORCPg6klLcuze
Question
A data engineer has a Python notebook in Databricks, but they need to use SQL to accomplish a specific task within a cell. They still want all of the other cells to use Python without making any changes to those cells.
Which of the following describes how the data engineer can use SQL within a cell of their Python notebook?
Choices
- A: It is not possible to use SQL in a Python notebook
- B: They can attach the cell to a SQL endpoint rather than a Databricks cluster
- C: They can simply write SQL syntax in the cell
- D: They can add %sql to the first line of the cell
- E: They can change the default language of the notebook to SQL
Answer: D Answer_ET: D Community answer D (100%) Discussion
Comment 1263480 by 80370eb
- Upvotes: 1
Selected Answer: D We can use a magic command in the notebook to indicate the language a cell should run in, e.g. %sql.
Comment 1127386 by azure_bimonster
- Upvotes: 1
Selected Answer: D The % magic command can be used to switch a cell's language, so D is correct
Comment 1117398 by bartfto
- Upvotes: 1
Selected Answer: D D is correct. Use the %sql magic in the first line.
Comment 1078561 by Lavpak
- Upvotes: 3
Selected Answer: D Use magic command %sql
Comment 1057563 by MFEST
- Upvotes: 3
correct answer D
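The pattern the comments describe can be sketched as a single cell in an otherwise-Python notebook; the table name below is illustrative:

```sql
%sql
-- This cell runs as SQL even though the notebook's default language is
-- Python; the %sql magic on the first line applies only to this cell.
SELECT id, name
FROM some_table
LIMIT 10
```

Equivalently, a Python cell can run SQL without any magic by calling spark.sql("SELECT ..."), though the question's answer is the %sql magic.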
Question OmuUP1e2bQReHd3lMgId
Question
Which of the following SQL keywords can be used to convert a table from a long format to a wide format?
Choices
- A: TRANSFORM
- B: PIVOT
- C: SUM
- D: CONVERT
- E: WHERE
Answer: B Answer_ET: B Community answer B (100%) Discussion
Comment 1263485 by 80370eb
- Upvotes: 1
Selected Answer: B The PIVOT operation is used to rotate data from rows to columns, which effectively converts a table from a long format (where each row represents a single observation or measurement) to a wide format (where each row represents a single entity with multiple observations or measurements as columns).
Comment 1203830 by benni_ale
- Upvotes: 1
Selected Answer: B B is correct
Comment 1127390 by azure_bimonster
- Upvotes: 1
Selected Answer: B Answer is B
Comment 1101148 by AndreFR
- Upvotes: 2
Selected Answer: B https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-qry-select-pivot.html
“Pivot” transforms the rows of the table_reference by rotating unique values of a specified column list into separate columns.
SYNTAX:
table_reference PIVOT ( { aggregate_expression [ [ AS ] agg_column_alias ] } [ , ... ]
  FOR column_list IN ( expression_list ) )
column_list: { column_name | ( column_name [ , ... ] ) }
expression_list: { expression [ [ AS ] column_alias ] | ( expression [ , ... ] ) [ [ AS ] column_alias ] } [ , ... ]
Comment 1052939 by athu07
- Upvotes: 2
Selected Answer: B PIVOT is correct.
Comment 1050139 by meow_akk
- Upvotes: 3
PIVOT is correct.
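A hedged sketch of PIVOT turning a long table wide, under the assumption of a hypothetical table sales_long(employeeId STRING, quarter STRING, amount DOUBLE); names and quarter values are illustrative:

```sql
-- Long format: one row per (employee, quarter) observation.
SELECT *
FROM sales_long
  PIVOT (
    SUM(amount)
    FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4')
  );
-- Wide format result: one row per employee,
-- with the aggregated amounts spread across columns Q1..Q4.
```

Note that PIVOT requires an aggregate expression (SUM here), which is why C alone is not the answer: SUM aggregates, but PIVOT is the keyword that rotates rows into columns.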
Question JJoRkcfWTPzHbGJYAyL7
Question
Which of the following describes a benefit of creating an external table from Parquet rather than CSV when using a CREATE TABLE AS SELECT statement?
Choices
- A: Parquet files can be partitioned
- B: CREATE TABLE AS SELECT statements cannot be used on files
- C: Parquet files have a well-defined schema
- D: Parquet files have the ability to be optimized
- E: Parquet files will become Delta tables
Answer: C Answer_ET: C Community answer C (88%) 6% Discussion
Comment 1247403 by 1a44567
- Upvotes: 1
Vote for D. Parquet is a columnar storage file format that allows efficient data compression and encoding schemes, enabling optimization and faster query performance compared to CSV. It supports efficient reading and writing of large datasets, making it a preferred choice for big data applications.
Comment 1219456 by MDWPartners
- Upvotes: 1
Selected Answer: C The key words are “CREATE TABLE AS SELECT”.
Comment 1203832 by benni_ale
- Upvotes: 1
Selected Answer: C C is correct
Comment 1132190 by UGOTCOOKIES
- Upvotes: 2
Selected Answer: C CREATE TABLE AS SELECT adopts the schema details from the source. Parquet files have a defined schema.
Comment 1117399 by bartfto
- Upvotes: 1
Selected Answer: C C. Parquet has a well-defined schema, unlike CSV.
Comment 1109975 by Garyn
- Upvotes: 1
Selected Answer: C C. Parquet files have a well-defined schema.
Explanation:
Parquet files inherently store metadata about the schema within the files themselves, allowing for a well-defined schema. This schema information includes data types, column names, and other structural information. When creating an external table from Parquet, this schema is retained, providing a structured and well-defined format for the data. It ensures consistency and enables more efficient processing, query optimization, and compatibility across various systems or tools that work with the Parquet format. This structured schema within Parquet files offers advantages in terms of data integrity, ease of data processing, and compatibility, making it a beneficial choice over CSV, which lacks inherent schema information and might need additional handling or inference of schema during data ingestion.
Comment 1101159 by AndreFR
- Upvotes: 1
Selected Answer: B The key word here is : CREATE TABLE AS SELECT
not A : partitioning is not relevant in a create table as statement because the data will be created in a delta table not C : Parquet schema is not well defined and there can be parquet files with multiple schema in a folder not D : Parquet are already optimized and are not relevant in a create table as statement because the data will be created in a delta table not E : both CSV & Parquet will become delta tables in a create table as statement B : correct answer by elimination
Comment 1094895 by nedlo
- Upvotes: 1
Selected Answer: D I disagree; I think it's D. A schema can be inferred from CSV as well, but CSV cannot provide the same optimizations as Parquet.
Comment 1058405 by FastEddie
- Upvotes: 4
Selected Answer: C CTAS statements automatically infer schema information from the query results and do not support manual schema declaration. This means that CTAS statements are useful for external data ingestion from sources with a well-defined schema, such as Parquet files and tables. CTAS statements also do not support specifying additional file options.
Comment 1057667 by kishore1980
- Upvotes: 2
Selected Answer: C C is the correct option
Comment 1053321 by anandpsg101
- Upvotes: 2
Selected Answer: C c is correct
Comment 1050150 by meow_akk
- Upvotes: 4
Columnar storage like Apache Parquet is designed to bring efficiency compared to row-based files like CSV. When querying columnar storage, you can skip over the non-relevant data very quickly. As a result, aggregation queries are less time-consuming compared to row-oriented databases.
Comment 1049562 by kbaba101
- Upvotes: 4
C. CTAS works with sources that have a well-defined schema, such as Parquet files and tables, and it does not support specifying additional file options such as a delimiter, which you would need if you used CSV.
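The CTAS point several commenters make can be sketched like this; the table names and paths are illustrative, not from the question:

```sql
-- Parquet embeds its schema in the files, so CTAS can infer
-- column names and types directly from the source:
CREATE TABLE sales AS
SELECT * FROM parquet.`/mnt/raw/sales/`;

-- CSV carries no schema, and CTAS does not accept file options
-- (header, delimiter, ...), so a CSV source is usually ingested
-- with an explicit schema and options instead, e.g.:
CREATE TABLE sales_csv (employeeId STRING, amount DOUBLE)
USING CSV
OPTIONS (header = 'true', delimiter = ',')
LOCATION '/mnt/raw/sales_csv/';
```

Both statements produce Delta tables; the benefit of Parquet as the source is precisely that the schema comes along for free.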
Question uS5NDHw9HyR980QFix1C
Question
A data engineer wants to create a relational object by pulling data from two tables. The relational object does not need to be used by other data engineers in other sessions. In order to save on storage costs, the data engineer wants to avoid copying and storing physical data.
Which of the following relational objects should the data engineer create?
Choices
- A: Spark SQL Table
- B: View
- C: Database
- D: Temporary view
- E: Delta Table
Answer: D Answer_ET: D Community answer D (100%) Discussion
Comment 1050152 by meow_akk
- Upvotes: 8
D is correct. A temp view is session-based: CREATE TEMP VIEW view_name AS query. The session is considered ended by any of the following:
- Opening a new notebook
- Detaching and reattaching a cluster
- Installing a Python package
- Restarting a cluster
Comment 1263495 by 80370eb
- Upvotes: 1
Selected Answer: D A temporary view is created in memory and does not persist beyond the session. It does not require physical storage, making it ideal for avoiding storage costs.
Comment 1109989 by Garyn
- Upvotes: 2
Selected Answer: D D. Temporary view
Explanation:
Temporary View: A temporary view in database systems like Apache Spark provides a temporary and ephemeral representation of data based on an SQL query’s result set. It exists for the duration of a Spark session and is not persisted to storage. Similar to a regular view, a temporary view allows the data engineer to define a logical schema by pulling and combining data from multiple tables using SQL queries, but it does not store any physical data on disk. Temporary views are suitable when there’s no need for long-term storage of the combined data and are helpful for immediate analysis or processing within the current session without incurring storage costs.
Comment 1101164 by AndreFR
- Upvotes: 1
Selected Answer: D https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-view.html
Should be a view or temporary view to avoid copying and storing data.
Does not need to be used by other data engineers in other sessions. So this view should be temporary. TEMPORARY views are visible only to the session that created them and are dropped when the session ends.
Comment 1071964 by Huroye
- Upvotes: 2
Answer is D. key phrase is “…does not need to be used by other data engineers in other sessions…”
Comment 1057676 by kishore1980
- Upvotes: 1
Selected Answer: D D is the right option
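A sketch of the temporary-view pattern the answer describes; the table, view, and column names are illustrative:

```sql
-- Visible only to the current session and dropped when it ends;
-- no physical data is copied or stored.
CREATE TEMPORARY VIEW employee_ratings AS
SELECT e.employeeId, e.startDate, r.avgRating
FROM employees e
JOIN ratings r
  ON e.employeeId = r.employeeId;

-- Within this session it can be queried like a table:
SELECT * FROM employee_ratings WHERE avgRating > 4.0;
```

Dropping TEMPORARY from the statement would give option B (a plain view), which also avoids copying data but persists in the metastore and is visible to other sessions.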