Data Science Questions and Answers

Data Science Questions and Answers – Basics of Data Science

1. Which one of the following is a superset that includes the rest?

  • a). Data Analysis
  • b). Data Science
  • c). Business Analytics
  • d). Model creation
Answer: b

Explanation: Data Science is a multidisciplinary field that involves mathematics, statistics, data analysis and model creation.

2. Point out the correct statement.

  • a). Raw data is the original source of data
  • b). Preprocessed data is the original source of data
  • c). Raw data is the data obtained after processing steps
  • d). None of the mentioned
Answer: a

Explanation: Raw data, as gathered by web scraping or from an API, can contain null or imbalanced values that need to be cleaned.
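
The cleaning step mentioned above can be sketched as follows. This is a minimal illustration, not a prescribed method: the records, the field names `city` and `temp_c`, and the helper `clean` are all hypothetical, and only the Python standard library is used.

```python
# Hypothetical raw records, e.g. as returned by a scraper or an API.
raw_records = [
    {"city": "Pune", "temp_c": "31.5"},
    {"city": "Delhi", "temp_c": None},     # missing value
    {"city": " Mumbai ", "temp_c": "29"},  # stray whitespace
    {"city": "Chennai", "temp_c": ""},     # empty string
]


def clean(records):
    """Drop rows with a missing temperature and normalise the rest."""
    cleaned = []
    for row in records:
        value = row["temp_c"]
        if value is None or value.strip() == "":
            continue  # skip rows that cannot be used
        cleaned.append({"city": row["city"].strip(), "temp_c": float(value)})
    return cleaned


# The raw list is left untouched; the processed data is a new list.
processed = clean(raw_records)
print(processed)
```

Keeping `raw_records` unchanged means the original source stays available if the cleaning rules need to be revised later.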

3. Which of the following is performed by a Data Scientist?

  • a). Define the problem statement
  • b). Create a statistical model
  • c). Analyse the model
  • d). All of the mentioned
Answer: d

4. Which of the following is the most important language for Data Science?

  • a). Java
  • b). Ruby
  • c). R or Python
  • d). None of the mentioned
Answer: c

Explanation: R and Python are free software for statistical computing and analysis, and both have extensive statistical libraries.

5. Point out the wrong statement.

  • a). Merging concerns combining datasets on the same observations to produce a result with more variables
  • b). Data visualization is the organization of information according to preset specifications
  • c). Subsetting can be used to select and exclude variables and observations
  • d). All of the mentioned
Answer: b

Explanation: Data formatting is the organization of information according to preset specifications.
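
The three operations named in this question (merging, subsetting and formatting) can be illustrated with a short sketch. The datasets, keys and variable names below are hypothetical and chosen only for illustration.

```python
# Two hypothetical datasets describing the same observations (keyed by id).
heights = {1: {"height_cm": 170}, 2: {"height_cm": 182}, 3: {"height_cm": 165}}
weights = {1: {"weight_kg": 68}, 2: {"weight_kg": 90}, 3: {"weight_kg": 55}}

# Merging: same observations, more variables per observation.
merged = {
    key: {**heights[key], **weights[key]}
    for key in heights
    if key in weights
}

# Subsetting: keep only some observations (height >= 170)
# and only some variables (height_cm).
tall = {
    key: {"height_cm": row["height_cm"]}
    for key, row in merged.items()
    if row["height_cm"] >= 170
}

# Formatting: render values according to a preset specification,
# here a zero-padded id and one decimal place.
formatted = [f"{key:03d},{row['height_cm']:.1f}" for key, row in sorted(tall.items())]
print(formatted)
```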

6. Which of the following approaches should be used to ask a data analysis question?

  • a). Find only one solution for a particular problem
  • b). Find out the question which is to be answered
  • c). Find out the answer from the dataset without asking a question
  • d). None of the mentioned
Answer: b

Explanation: Data analysis has multiple facets and approaches, so the first step is to pin down the question that the analysis must answer.

7. Which of the following is a key data science skill?

  • a). Statistics
  • b). Machine Learning
  • c). Data Visualization
  • d). All of the mentioned
Answer: d

8. Which of the following is a characteristic of processed data?

  • a). Data is not ready for analysis
  • b). All steps should be noted
  • c). Hard to use for data analysis
  • d). None of the mentioned
Answer: b

Explanation: Processing includes merging, summarizing and subsetting data, and every step should be recorded so that the processed data can be reproduced.
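
One way to keep a note of every processing step is to log each step as it is applied. The sketch below assumes a hypothetical list of scores and a hypothetical `steps` log; it is one possible approach, not a standard API.

```python
from statistics import mean

# Hypothetical raw scores with missing entries.
raw_scores = [72, 85, None, 90, 64, None, 78]
steps = []  # running record of every processing step

# Step 1: subsetting, drop the missing observations.
present = [score for score in raw_scores if score is not None]
steps.append(f"dropped {len(raw_scores) - len(present)} missing values")

# Step 2: summarizing, reduce the data to a few descriptive numbers.
summary = {
    "n": len(present),
    "mean": mean(present),
    "min": min(present),
    "max": max(present),
}
steps.append("summarized n, mean, min, max")

print(summary)
print(steps)
```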

9. Raw data should be processed only one time.

  • a). True
  • b). False
Answer: b

Explanation: Raw data can be cleaned multiple times, depending on model accuracy.

10. Point out the correct statement.

  • a). Least squares is an estimation tool
  • b). Least squares problems fall into three categories
  • c). Compound least squares is one of the categories of least squares
  • d). None of the mentioned
Answer: a

Explanation: The method of least squares is a procedure for determining the best-fit line to data by minimising the sum of squared residuals.
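
For a single predictor, the least squares line has a closed-form solution: the slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means. A minimal sketch, where the function name `least_squares_line` and the sample data are hypothetical:

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimising the squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)
```

The function assumes the x values are not all identical, since that would make `sxx` zero.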

11. Which of the following is not a role of a Data Engineer?

  • a). Data gathering
  • b). Data Warehousing
  • c). Feature Engineering & Selection
  • d). None of the mentioned
Answer: c

Explanation: Feature engineering and selection is typically done by the data analyst rather than the data engineer.

12. Which of the following is not a step in data analysis?

  • a). Obtain the data
  • b). Clean the data
  • c). EDA
  • d). None of the mentioned
Answer: d

Explanation: EDA stands for Exploratory Data Analysis.

13. Point out the wrong statement.

  • a). Simple linear regression is equipped to handle more than one predictor
  • b). Compound linear regression is not equipped to handle more than one predictor
  • c). Linear regression consists of finding the best-fitting straight line through the points
  • d). All of the mentioned
Answer: a

Explanation: Simple linear regression handles exactly one predictor; handling more than one predictor requires multiple linear regression.

14. Which of the following techniques comes under practical machine learning?

  • a). Bagging
  • b). Boosting
  • c). Forecasting
  • d). None of the mentioned
Answer: b

Explanation: Boosting is an approach to machine learning based on the idea of creating a highly accurate predictor by combining many weak, relatively inaccurate rules.

15. Which of the following techniques is also referred to as bagging?

  • a). Bootstrap aggregating
  • b). Bootstrap subsetting
  • c). Bootstrap predicting
  • d). All of the mentioned
Answer: a

Explanation: Bagging is used in statistical classification and regression.
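
Bootstrap aggregating can be sketched in three moves: draw bootstrap samples (sampling with replacement), fit one base model per sample, then aggregate the predictions. The sketch below is only an illustration under stated assumptions: the base learner is a hypothetical one-nearest-neighbour regressor, the data points are made up, and the aggregation is a plain average (for classification a majority vote would be used instead).

```python
import random


def fit_nearest_neighbour(sample):
    """Base learner: predict the y of the training point with the closest x."""
    def predict(x):
        nearest = min(sample, key=lambda point: abs(point[0] - x))
        return nearest[1]
    return predict


def bagged_predictor(data, n_models=25, seed=0):
    """Fit n_models base learners on bootstrap samples and average them."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) points with replacement.
        sample = [rng.choice(data) for _ in data]
        models.append(fit_nearest_neighbour(sample))

    def predict(x):
        # Aggregate: average the individual predictions.
        return sum(model(x) for model in models) / len(models)

    return predict


# Hypothetical (x, y) observations.
data = [(1, 1.2), (2, 1.9), (3, 3.2), (4, 3.8), (5, 5.1)]
predict = bagged_predictor(data)
estimate = predict(3)
print(estimate)
```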

16. Which of the following is a characteristic of raw data?

  • a). Data is ready for analysis
  • b). Original version of data
  • c). Easy to use for data analysis
  • d). None of the mentioned
Answer: b

Explanation: Raw data is data that has not been processed for use.

17. Which software is used for visualization?

  • a). Tableau
  • b). Power BI
  • c). Python/R
  • d). All of the above
Answer: d

18. Which mathematical/statistical function is widely used for Model Optimisation?

  • a). Linear Regression
  • b). Linear Algebra
  • c). Calculus
  • d). Probability
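Answer: c

Explanation: Model optimisation usually means minimising a loss function, and methods such as gradient descent rely on derivatives, which come from calculus.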
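
Model optimisation is commonly carried out with gradient-based methods, where the derivative of a loss function tells the optimiser which way to move the parameters. A minimal sketch of gradient descent, assuming a hypothetical one-parameter linear model y = w * x, a mean squared error loss, made-up data and an arbitrarily chosen learning rate:

```python
# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def loss(w):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def loss_gradient(w):
    """Derivative of the mean squared error with respect to w."""
    return 2 * sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.0               # initial guess
learning_rate = 0.05  # step size, chosen by hand for this data
for _ in range(200):
    # Move against the gradient to reduce the loss.
    w -= learning_rate * loss_gradient(w)

print(w, loss(w))
```

A learning rate that is too large would make the updates diverge on this data, so the step size is itself something to tune.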

19. Python can be used for:

  • a). Web scraping
  • b). ML/DL algorithms and models
  • c). Website development
  • d). All of the above
Answer: d