Questions and Answers
Question 8zcDViuerlOgVprgqdt3
Question
A data engineer has created a new database using the following command:
CREATE DATABASE IF NOT EXISTS customer360;
In which of the following locations will the customer360 database be located?
Choices
- A: dbfs:/user/hive/database/customer360
- B: dbfs:/user/hive/warehouse
- C: dbfs:/user/hive/customer360
- D: More information is needed to determine the correct response
- E: dbfs:/user/hive/database
answer?
Answer: B
Answer_ET: B
Community answer: B (72%), D (28%)
Discussion
Comment 1049602 by kbaba101
- Upvotes: 10
B. dbfs:/user/hive/warehouse. The database itself then shows up as “dbfs:/user/hive/warehouse/customer360.db”.
Comment 1415857 by 8221ec5
- Upvotes: 1
Selected Answer: B The answer is B, 100%. I checked with the DESCRIBE SCHEMA statement.
Comment 1311787 by Medkalys
- Upvotes: 2
Selected Answer: D I think each environment can be configured differently, so the database will not necessarily be created in dbfs:/user/hive/warehouse.
Comment 1255160 by samverma
- Upvotes: 1
D. The database could be created after a USE CATALOG statement. In that case the location would be different, not in the Hive warehouse.
Comment 1216205 by aspix82
- Upvotes: 1
B. B is “default”
Comment 1213749 by jskibick
- Upvotes: 3
Selected Answer: D D is correct. We do not know whether this workspace is Unity Catalog enabled. If it is, the database would be created in the catalog's default managed location. There is therefore too little information to answer.
Comment 1203840 by benni_ale
- Upvotes: 1
Selected Answer: B B is correct
Comment 1158420 by Bob123456
- Upvotes: 1
While the terms schema and database are interchangeable, schema is preferred. Option B is correct.
Comment 1132193 by UGOTCOOKIES
- Upvotes: 2
Selected Answer: B Creating a table without the LOCATION keyword creates a managed table in the default directory, which is dbfs:/user/hive/warehouse. See https://docs.databricks.com/en/dbfs/root-locations.html
Comment 1110008 by Garyn
- Upvotes: 2
Selected Answer: B B. dbfs:/user/hive/warehouse
Explanation:
In Databricks, the default location for databases created in the Hive metastore is under the warehouse directory. The CREATE DATABASE command creates the metadata entry for the database in the Hive metastore and, when no LOCATION is given, assigns the database a directory under the warehouse root in DBFS (Databricks File System).
The exact path may differ based on the configuration of the Databricks environment, but by default the data for managed tables in customer360 is stored under dbfs:/user/hive/warehouse/customer360.db, while the metadata for the database is held in the Hive metastore.
Comment 1089735 by kz_data
- Upvotes: 1
Selected Answer: B B is correct
Comment 1071971 by Huroye
- Upvotes: 1
The correct answer is B. dbfs:/user/hive/warehouse. All managed objects are stored in the default location unless another location is specified.
Comment 1053323 by anandpsg101
- Upvotes: 1
Selected Answer: B B is correct
Comment 1052559 by SD5713
- Upvotes: 2
Selected Answer: B dbfs:/user/hive/warehouse - which is the default location
Comment 1050160 by meow_akk
- Upvotes: 1
Ans A: https://community.databricks.com/t5/data-engineering/database-within-a-database-in-databricks/td-p/20731#:~:text=The%20default%20location%20of%20a,and%20Table%20location%20are%20independent. The default location of a database will be /user/hive/warehouse/<databasename.db>. Irrespective of the location of the database, the tables in the database can have different locations, which can be specified at creation time. Database location and table location are independent.
Comment 1048865 by kishanu
- Upvotes: 3
Selected Answer: B dbfs:/user/hive/warehouse - which is the default location of any object created
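The path convention mentioned in the discussion (warehouse root plus `<name>.db`) can be sketched as a small helper. This is only an illustration under the assumption of the legacy Hive metastore default with no LOCATION clause; the function name `default_database_path` is hypothetical and nothing here queries a real workspace or accounts for Unity Catalog.

```python
# Hypothetical helper: mirrors the naming convention described in the comments
# for the legacy Hive metastore. It does not talk to Databricks at all.
DEFAULT_WAREHOUSE_ROOT = "dbfs:/user/hive/warehouse"  # default cited in the discussion


def default_database_path(database_name, warehouse_root=DEFAULT_WAREHOUSE_ROOT):
    """Return the directory assumed for a database created without LOCATION."""
    return f"{warehouse_root.rstrip('/')}/{database_name}.db"


print(default_database_path("customer360"))
# dbfs:/user/hive/warehouse/customer360.db
```

On a real workspace, the location reported by DESCRIBE SCHEMA is the authoritative value, as one commenter notes.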
Question yG869SCQfkssJEBqHmFe
Question
A data engineer is attempting to drop a Spark SQL table my_table and runs the following command:
DROP TABLE IF EXISTS my_table;
After running this command, the engineer notices that the data files and metadata files have been deleted from the file system.
Which of the following describes why all of these files were deleted?
Choices
- A: The table was managed
- B: The table’s data was smaller than 10 GB
- C: The table’s data was larger than 10 GB
- D: The table was external
- E: The table did not have a location
answer?
Answer: A
Answer_ET: A
Community answer: A (100%)
Discussion
Comment 1203841 by benni_ale
- Upvotes: 1
Selected Answer: A A is correct
Comment 1132195 by UGOTCOOKIES
- Upvotes: 3
Selected Answer: A There are two types of tables, managed and external. Both table types are treated the same, except when the table is dropped. For a managed table, the data is stored in the managed storage location configured for the metastore. By default this is dbfs:/user/hive/warehouse. When the table is dropped, the metadata and the underlying data are deleted. For external tables, the data is stored in a cloud storage location outside of the managed storage location. The underlying data is retained when an external table is dropped; only the metadata is dropped.
Comment 1110012 by Garyn
- Upvotes: 1
Selected Answer: A A. The table was managed.
Explanation:
In Spark SQL, when a table is managed (or internal), both the metadata that contains information about the table and the actual data files associated with the table are managed by the SQL engine.
The DROP TABLE command, when used on a managed table, deletes not only the metadata but also the underlying data files associated with that table from the file system.
When a managed table is dropped, it removes all information about the table, including metadata and data files, leading to the deletion of both the metadata and data files from the file system.
Options B, C, D, and E don’t specifically relate to why the data files and metadata files were deleted. The fact that the table was managed (or internal) is the reason for the removal of both the metadata and data files when the table was dropped using the DROP TABLE command.
Comment 1089737 by kz_data
- Upvotes: 2
Selected Answer: A A is correct
Comment 1050163 by meow_akk
- Upvotes: 4
A is correct. A managed table's files and metadata are managed by the metastore and are deleted when the table is dropped. For external tables, the data is stored in an external location, so when an external table is dropped you clear only the metadata and the files (data) remain.
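The managed versus external behavior described in this discussion can be imitated with a toy model. The sketch below is not Spark or Databricks code: `ToyCatalog`, its methods, and the file name are all hypothetical, and a temporary local directory stands in for cloud storage.

```python
import shutil
import tempfile
from pathlib import Path


class ToyCatalog:
    """Toy stand-in for a metastore; not a Spark or Databricks API."""

    def __init__(self):
        self.tables = {}  # table name -> (data directory, is_managed)

    def create_table(self, name, data_dir, managed):
        data_dir.mkdir(parents=True, exist_ok=True)
        (data_dir / "part-0000.parquet").write_text("fake data")
        self.tables[name] = (data_dir, managed)

    def drop_table_if_exists(self, name):
        entry = self.tables.pop(name, None)  # the metadata entry is always removed
        if entry is None:
            return  # IF EXISTS: a missing table is not an error
        data_dir, managed = entry
        if managed:
            shutil.rmtree(data_dir)  # managed: the data files are removed too


root = Path(tempfile.mkdtemp())
catalog = ToyCatalog()
catalog.create_table("managed_tbl", root / "warehouse" / "managed_tbl", managed=True)
catalog.create_table("external_tbl", root / "external" / "external_tbl", managed=False)

catalog.drop_table_if_exists("managed_tbl")
catalog.drop_table_if_exists("external_tbl")

managed_files_remain = (root / "warehouse" / "managed_tbl").exists()
external_files_remain = (root / "external" / "external_tbl").exists()
print(managed_files_remain, external_files_remain)  # False True

shutil.rmtree(root)  # clean up the temporary directory
```

Both metadata entries disappear, but only the managed table's directory is deleted, which matches the observation in the question.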
Question Bs9Kd3bUuNyHmHMbOpAr
Question
A data engineer who is new to Python needs to create a Python function that adds two integers together and returns the sum.
Which of the following code blocks can the data engineer use to complete this task?
Choices
- A:
- B:
- C:
- D:
- E:
answer?
Answer: D
Answer_ET: D
Community answer: D (100%)
Discussion
Comment 1132197 by UGOTCOOKIES
- Upvotes: 1
Selected Answer: D A Python function definition starts with the def keyword followed by the function name. The function hands back its result with the return keyword.
Comment 1127406 by azure_bimonster
- Upvotes: 1
Selected Answer: D D is the one to choose here
Comment 1089738 by kz_data
- Upvotes: 2
Selected Answer: D D is correct
Comment 1083474 by 55f31c8
- Upvotes: 3
Selected Answer: D D : https://www.geeksforgeeks.org/python-functions/
Comment 1056557 by Syd
- Upvotes: 2
D is correct. https://www.w3schools.com/python/python_functions.asp
Comment 1050165 by meow_akk
- Upvotes: 3
D is correct. If you get this answer wrong, you need to learn the basics of Python.
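The answer choices were not captured in this dump, so the following is only a minimal sketch of the kind of function the question describes, not the text of option D. The name `add_integers` is an assumption.

```python
def add_integers(a, b):
    """Return the sum of two integers."""
    return a + b  # without return, the function would give back None


print(add_integers(2, 3))  # 5
```

The two points raised in the comments both appear here: the definition begins with `def`, and the result is handed back with `return` rather than printed inside the function.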
Question yLBcudmi1ZWvcki3fynH
Question
In which of the following scenarios should a data engineer use the MERGE INTO command instead of the INSERT INTO command?
Choices
- A: When the location of the data needs to be changed
- B: When the target table is an external table
- C: When the source table can be deleted
- D: When the target table cannot contain duplicate records
- E: When the source is not a Delta table
answer?
Answer: D
Answer_ET: D
Community answer: D (100%)
Discussion
Comment 1170501 by fifirifi
- Upvotes: 2
Selected Answer: D Correct answer: D. Explanation: The MERGE INTO command is used when you need to perform both insertions and updates (or deletes) in one operation based on whether a match exists. It is particularly useful for maintaining up-to-date data and ensuring there are no duplicate records in the target table. This is often referred to as an “upsert” operation (update + insert). When the target table needs to be kept free of duplicate records, and there’s a need to update existing records or insert new ones based on some matching condition, MERGE INTO is the appropriate command. The INSERT INTO command, on the other hand, is used to add new records to a table without regard for whether they duplicate existing records. Options A, B, C, and E do not specifically require the use of MERGE INTO. Therefore, D is the correct answer.
Comment 1132198 by UGOTCOOKIES
- Upvotes: 2
Selected Answer: D With MERGE INTO you can upsert (update or insert) data from a source table, view, or DataFrame into the target table. The merge operation allows updates, inserts, and deletes to be completed in a single atomic transaction. The main benefit of MERGE INTO is avoiding duplicates, but it does not inherently remove duplicates.
Comment 1127407 by azure_bimonster
- Upvotes: 2
Selected Answer: D D is the answer here
Comment 1089739 by kz_data
- Upvotes: 2
Selected Answer: D D is correct
Comment 1050166 by meow_akk
- Upvotes: 1
Ans D: With merge, you can avoid inserting the duplicate records. The dataset containing the new logs needs to be deduplicated within itself. By the SQL semantics of merge, it matches and deduplicates the new data with the existing data in the table, but if there is duplicate data within the new dataset, it is inserted. https://docs.databricks.com/en/delta/merge.html#:~:text=With%20merge%20%2C%20you%20can%20avoid%20inserting%20the%20duplicate%20records.&text=The%20dataset%20containing%20the%20new,new%20dataset%2C%20it%20is%20inserted.
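The difference the commenters describe between appending and upserting can be illustrated with plain Python lists. This is a toy model, not Delta Lake: the functions `insert_into` and `merge_into`, the `id` key, and the sample rows are all hypothetical, and it ignores Delta specifics such as transactions and how multiple matches are handled.

```python
def insert_into(target, source):
    """Append every source row without any matching, like INSERT INTO."""
    return [dict(row) for row in target] + [dict(row) for row in source]


def merge_into(target, source, key):
    """Upsert source rows into target, matching on `key` against target only."""
    result = [dict(row) for row in target]
    index = {row[key]: i for i, row in enumerate(result)}  # keys already in target
    for row in source:
        if row[key] in index:
            result[index[row[key]]] = dict(row)  # WHEN MATCHED: update
        else:
            result.append(dict(row))  # WHEN NOT MATCHED: insert
    return result


target = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
source = [{"id": 2, "name": "Robert"}, {"id": 3, "name": "Cy"}]

inserted = insert_into(target, source)
merged = merge_into(target, source, "id")

print(len(inserted))  # 4 rows: id 2 now appears twice
print(merged)         # 3 rows: id 2 updated, id 3 inserted
```

Because matching is done against the target only, two source rows sharing a new key would both be inserted, which is the caveat raised in the comment above about deduplicating the new data first.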
Question L4VuGHOqoQlyrN9K4msA
Question
A data engineer is working with two tables. Each of these tables is displayed below in its entirety.
//IMG//
The data engineer runs the following query to join these tables together:
//IMG//
Which of the following will be returned by the above query?
Choices
- A:
- B:
- C:
- D:
- E:
answer?
Answer: C
Answer_ET: C
Community answer: C (100%)
Discussion
Comment 1089741 by kz_data
- Upvotes: 7
Selected Answer: C C is correct
Comment 1267938 by 80370eb
- Upvotes: 1
Selected Answer: C C is correct. When performing a left join, all of the sales rows are kept.
Comment 1234425 by potaryxkug
- Upvotes: 3
C is correct
Comment 1088571 by nedlo
- Upvotes: 3
Selected Answer: C C is correct answer
Comment 1083489 by 55f31c8
- Upvotes: 3
Selected Answer: C The LEFT JOIN keyword returns all records from the left table (table1), and the matching records from the right table (table2). The result is 0 records from the right side, if there is no match.
Comment 1076163 by Blacknight99
- Upvotes: 2
Selected Answer: C C is the correct answer
Comment 1071977 by Huroye
- Upvotes: 3
The answer is C. This is a LEFT JOIN. In this case you show everything on the left side regardless of whether it appears on the right. When it does not appear on the right, you represent that with a NULL. So, for a3, store_id is NULL.
Comment 1067336 by dev_soumya369
- Upvotes: 2
C is the correct answer. In a LEFT JOIN, all the records from the left table are included, and only the matching records from the right table are added. In this case, “a1” and “a4” from the left table (favorite_stores) match with “a1” and “a4” from the right table (sales). So, these matching records are fetched. Additionally, all the records from the left table, including “a3,” are included. Since “a3” has no corresponding store_id in the right table, the store_id for “a3” will be NULL. Therefore, after the LEFT JOIN, the result will include “a1,” “a3” (with a NULL store_id), and “a4.”
Comment 1064822 by mokrani
- Upvotes: 1
C is correct. Please refer to this simple blog if there is any confusion regarding joins: https://sql.sh/cours/jointures
Comment 1053325 by anandpsg101
- Upvotes: 2
Selected Answer: C C is correct
Comment 1050167 by meow_akk
- Upvotes: 1
Ans C: A left join keeps all records from the left table and only the matching records from the right table. In other words, the left table is preserved as is.
Comment 1048890 by kishanu
- Upvotes: 1
Selected Answer: C A typical LEFT JOIN scenario
Comment 1048436 by [Removed]
- Upvotes: 3
Selected Answer: C The LEFT JOIN keyword returns all records from the left table, even if there are no matches in the right table.
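The tables and query in this question were images that were not captured, so the sketch below uses made-up data to show the LEFT JOIN behavior the comments describe. It runs on SQLite through Python's standard `sqlite3` module rather than Spark SQL; the table names `left_tbl` and `right_tbl` and every value in them are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_tbl (customer_id TEXT, spend INTEGER);
    CREATE TABLE right_tbl (customer_id TEXT, store_id TEXT);
    INSERT INTO left_tbl VALUES ('a1', 10), ('a3', 20), ('a4', 30);
    INSERT INTO right_tbl VALUES ('a1', 's1'), ('a4', 's4');
""")

# Every row of left_tbl is kept; store_id is NULL (None in Python)
# where right_tbl has no matching customer_id.
rows = conn.execute("""
    SELECT l.customer_id, l.spend, r.store_id
    FROM left_tbl AS l
    LEFT JOIN right_tbl AS r ON l.customer_id = r.customer_id
    ORDER BY l.customer_id
""").fetchall()

for row in rows:
    print(row)
# ('a1', 10, 's1')
# ('a3', 20, None)
# ('a4', 30, 's4')

conn.close()
```

The row for `a3` survives with a NULL on the right-hand side, which is the pattern several commenters point to when explaining why C is correct.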