Sample Chapter

 

Test Bank For Business Forecasting 6th Edition by Wilson

 

 

SAMPLE QUESTIONS

 

 

Chapter 1

 

MULTIPLE CHOICE TEST BANK

 

Note:  The correct answer is denoted by  **.

 

  1. Which of the following does not require sophisticated quantitative forecasts?

 

  1. A) Accounting revenue forecasts for tax purposes.
  2. B) Money managers' use of interest rate forecasts for asset allocation decisions.
  3. C) Managers of power plants using weather forecasts in forecasting power demand.
  4. D) State highway planners require peak load forecasts for planning purposes.
  5. E) All the above require quantitative forecasts.   **

 

  1. Under what circumstances may it make sense not to prepare a business forecast?

 

  1. A) No data is readily available.
  2. B) The future will be no different from the past.  **
  3. C) The forecast horizon is 40 years.
  4. D) There is no consensus among informed individuals.
  5. E) The industry to forecast is undergoing dramatic change.

 

  1. What is most likely to be the major difference between forecasting sales of a private business versus forecasting the demand of a public good supplied by a governmental agency?

 

  1. A) Amount of data available.
  2. B) Underlying economic relationships.
  3. C) Lack of market-determined price data for public goods.  **
  4. D) Lack of historical data.
  5. E) Lack of quantitative ability by government forecasters.

 

  1. Which of the following points about supply chain management is incorrect?

 

  1. A) Forecasts are required at each step in the supply chain.
  2. B) Forecasts of sales are required for partners in the supply chain.
  3. C) Collaborative forecasting systems across the supply chain are needed.
  4. D) If you get the forecast right, you have the potential to get everything else right in the supply chain.
  5. E) None of the above.  **

 

  1. Which of the following is not typically part of the traditional forecasting textbook?

 

  1. A) Classical statistics applied to business forecasting.
  2. B) Use of computationally intensive forecasting software.  **
  3. C) Attention to simplifying assumptions about the data.
  4. D) Discussion of probability distributions.
  5. E) Attention to statistical inference.

 

  1. Which subjective forecasting method depends upon the anonymous opinion of a panel of individuals to generate sales forecasts?

 

  1. A) Sales Force Composites.
  2. B) Customer Surveys.
  3. C) Jury of Executive Opinion.
  4. D) Delphi  **
  5. E) None of the above.

 

  1. Which subjective sales forecasting method may have the most information about the spending plans of customers for a specific firm?

 

  1. A) Sales Force Composites.  **
  2. B) Index of consumer sentiment.
  3. C) Jury of Executive Opinion.
  4. D) Delphi
  5. E) None of the above.

 

  1. Which subjective sales forecasting technique may have problems with individuals who have a dominant personality?

 

  1. A) Sales Force Composites.
  2. B) Customer Surveys.
  3. C) Jury of Executive Opinion.  **
  4. D) Delphi
  5. E) None of the above.

 

  1. Which of the following methods is not useful for forecasting sales of a new product?

 

  1. A) Time series techniques requiring lots of historical data.  **
  2. B) Delphi
  3. C) Consumer Surveys.
  4. D) Test market results.
  5. E) All the above are correct.

 

  1. Which of the following is not considered a subjective forecasting method?

 

  1. A) Sales force composites.
  2. B) Naive methods.  **
  3. C) Delphi methods.
  4. D) Juries of executive opinion.
  5. E) Consumer surveys.

 

  1. Which of the following is not an argument for the use of subjective forecasting models?

 

  1. A) They are easy for management to understand.
  2. B) They are quite useful for long-range forecasts.
  3. C) They provide valuable information that may not be present in quantitative models.
  4. D) They are useful when data for using quantitative models is extremely limited.
  5. E) None of the above.  **

 

  1. Forecasts based solely on the most recent observation(s) of the variable of interest

 

  1. A) are called “naive” forecasts.
  2. B) are the simplest of all quantitative forecasting methods.
  3. C) lead to loss of one data point in the forecast series relative to the original series.
  4. D) are consistent with the “random walk” hypothesis in finance, which states that the optimal forecast of today’s stock rate of return is yesterday’s actual rate of return.
  5. E) All the above.  **

 

  1. You are given a time series of sales data with 10 observations. You construct forecasts according to last period’s actual level of sales plus the most recent observed change in sales.  How many data points will be lost in the forecast process relative to the original data series?

 

  1. A)
  2. B)   **
  3. C)
  4. D)
  5. E) None of the above.
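
Note: The forecasting rule described in the question (last period's level plus the most recent observed change) can be sketched as below. The sales figures and the function name are hypothetical and chosen only for illustration. A forecast exists only from the third period onward, because the rule needs two earlier observations.

```python
def naive_with_change(actual):
    """Forecast F[t] = A[t-1] + (A[t-1] - A[t-2]); None where no forecast can be formed."""
    forecasts = [None] * len(actual)
    for t in range(2, len(actual)):
        forecasts[t] = actual[t - 1] + (actual[t - 1] - actual[t - 2])
    return forecasts


# Hypothetical 10-observation sales series (not from the test bank).
sales = [100, 104, 103, 108, 112, 110, 115, 119, 118, 123]
forecasts = naive_with_change(sales)
lost = sum(f is None for f in forecasts)

print(forecasts)
print("periods without a forecast:", lost)
```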

 

  1. Suppose you are attempting to forecast a variable that is independent over time such as stock rates of return. A potential candidate-forecasting model is

 

  1. A) The Jury of Executive Opinion.
  2. B) Last period’s actual rate of return.  **
  3. C) The Delphi method.
  4. D) Last period’s actual rate of return plus some proportion of the most recently observed rate of change in the series.
  5. E) None of the above.

 

  1. Measures of forecast accuracy based upon a quadratic error cost function, notably root mean square error (RMSE), tend to treat

 

  1. A) levels of large and small forecast errors equally.
  2. B) large and small forecast errors equally on the margin.
  3. C) large and small forecast errors unequally on the margin.  **
  4. D) every forecast error with the same penalty.
  5. E) None of the above.

 

  1. Which of the following is incorrect? Evaluation of forecast accuracy

 

  1. A) is important since the production of forecasts is costly to the firm.
  2. B) requires the use of symmetric error cost functions.
  3. C) is important since it may reduce business losses from inaccurate forecasts.
  4. D) is done by averaging forecast errors.
  5. E) both b) and d) are incorrect.  **
  6. F) both a) and b) are incorrect.

 

  1. Which of the following measures of forecast accuracy can be used to compare “goodness of fit” across different sized variables?

 

  1. A) Mean Absolute Error.
  2. B) Mean Absolute Percentage Error.  **
  3. C) Mean Squared Error.
  4. D) Root Mean Squared Error.
  5. E) None of the above.

 

  1. Which of the following measures is a poor indicator of forecast accuracy, but useful in determining the direction of bias in a forecasting model?

 

  1. A) Mean Absolute Percentage Error.
  2. B) Mean Percentage Error.  **
  3. C) Mean Squared Error.
  4. D) Root Mean Squared Error.
  5. E) None of the above.

 

  1. Which measure of forecast accuracy is analogous to standard deviation?

 

  1. A) Mean Absolute Error.
  2. B) Mean Absolute Percentage Error.
  3. C) Mean Squared Error.
  4. D) Root Mean Squared Error.  **

 

  1. Which of the following measures of forecast performance are used to compare models for a given data series?

 

  1. A) Mean Error.
  2. B) Mean Absolute Error.
  3. C) Mean Squared Error.
  4. D) Root Mean Squared Error.
  5. E) All of the above.  **
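
Note: The accuracy measures named in the preceding questions can be computed as in the sketch below. The error is taken as actual minus forecast, which is an assumed sign convention, and the data are hypothetical. The example is built so that positive and negative errors cancel, which shows why the mean error says little about accuracy even though it can reveal the direction of bias.

```python
import math


def accuracy_measures(actual, forecast):
    """Return common forecast accuracy measures; error = actual - forecast (assumed convention)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MPE": 100 * sum(e / a for e, a in zip(errors, actual)) / n,
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
    }


# Hypothetical data: errors are +10, -10, +10, -10.
actual = [100, 110, 120, 130]
forecast = [90, 120, 110, 140]
measures = accuracy_measures(actual, forecast)

for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

MAPE and MPE are unit-free percentages, so they can be compared across series measured in different units. The other measures are expressed in the units of the series or in squared units.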

 

  1. What values of Theil’s U statistic are indicative of an improvement in forecast accuracy relative to the no-change naive model?

 

  1. A) U < 0.
  2. B) U = 0.
  3. C) U < 1.  **
  4. D) U > 1.
  5. E) None of the above.
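
Note: Several variants of Theil's U appear in the literature. The sketch below assumes the ratio form, in which the model's root sum of squared errors is divided by that of the no-change naive model. The data and function name are hypothetical.

```python
import math


def theils_u(actual, forecast):
    """Ratio of model error to no-change naive error (assumed ratio form of Theil's U).

    Both sums start at the second observation so that the naive
    forecast actual[t - 1] exists for every term.
    """
    model_sq = sum((actual[t] - forecast[t]) ** 2 for t in range(1, len(actual)))
    naive_sq = sum((actual[t] - actual[t - 1]) ** 2 for t in range(1, len(actual)))
    return math.sqrt(model_sq / naive_sq)


# Hypothetical series and two candidate forecast series.
actual = [100, 110, 120, 130]
model_forecast = [100, 108, 121, 129]
naive_forecast = [100, 100, 110, 120]  # no-change model: last period's actual

u_model = theils_u(actual, model_forecast)
u_naive = theils_u(actual, naive_forecast)

print("U for the model:", round(u_model, 4))
print("U for the naive model itself:", u_naive)
```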

 

  1. RMSE applied to the analysis of model forecast errors, treats

 

  1. A) levels of large and small forecast errors equally.
  2. B) large and small forecast errors equally on the margin.
  3. C) large and small forecast errors unequally on the margin.  **
  4. D) every forecast error with the same penalty.

 

  1. Because of different units of various data series, which accuracy statistic can be used across different series?

 

  1. A)
  2. B)
  3. C)   **
  4. D) MAE
  5. E) None of the above.

 

  1. Some helpful hints on judging forecast accuracy include:

 

  1. A) Be wary when the forecast outcome is not independent of the forecaster.
  2. B) Do not judge model adequacy based on large one-time errors.
  3. C) Do not place unwarranted faith in computer-based forecasts.
  4. D) Keep in mind what exactly you are trying to forecast.
  5. E) All of the above.  **

 

  1. Which of the following is not an appropriate use of forecast errors to assess the accuracy of a particular forecasting model?

 

  1. A) Examine a time series plot of the errors and look for a random pattern.
  2. B) Examine the average absolute value of the errors.
  3. C) Examine the average squared value of the errors.
  4. D) Examine the average level of the errors.  **
  5. E) None of the above.

 

  1. Which of the following forecasting methods requires use of large and extensive data sets?

 

  1. A) Naive methods.
  2. B) Exponential smoothing methods.
  3. C) Multiple regression.  **
  4. D) Delphi
  5. E) None of the above.

 

  1. When using quarterly data to forecast domestic car sales, how can the simple naive forecasting model be amended to model seasonal behavior of new car sales, i.e., patterns of sales that arise at the same time every year?

 

  1. A) Forecast next period’s sales based on this period’s sales.
  2. B) Forecast next period’s sales based on last period’s sales.
  3. C) Forecast next period’s sales based on the average sales over the current and last three quarters.
  4. D) Forecast next period’s sales based on sales four quarters ago.  **
  5. E) None of the above.
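
Note: A seasonal version of the naive model for quarterly data can be sketched as follows. Each forecast is the actual value from the same quarter one year earlier. The sales figures and the function name are hypothetical.

```python
def seasonal_naive(actual, season_length=4):
    """Forecast each period with the actual value from season_length periods earlier."""
    return [
        actual[t - season_length] if t >= season_length else None
        for t in range(len(actual))
    ]


# Hypothetical quarterly car sales for two years.
quarterly_sales = [50, 70, 90, 60, 55, 76, 97, 66]
seasonal_forecasts = seasonal_naive(quarterly_sales)

print(seasonal_forecasts)
```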

 

Chapter Three

 

Multiple Choice

Identify the choice that best completes the statement or answers the question.

 

____    1.   What factors do the five data smoothing techniques presented in Chapter Three have in common?

a. They all use only past observations of the data.
b. They all fail to forecast cyclical reversals in the data.
c. They all smooth short-term noise by averaging data.
d. They all produce serially correlated forecasts.
e. All the above.

 

 

____    2.   Time series smoothing techniques work best for applications where

a. little historical data are available to the forecaster.
b. there is a large amount of historical data available.
c. the forecast horizon is the distant future.
d. only periodic forecasts for untimely events are required.
e. All the above.

 

 

____    3.   Time-series smoothing techniques attempt to

a. suppress short-term variability in the data.
b. identify long-term trends or cycles in the data.
c. remove seasonality in the data.
d. suppress data noise while extracting trend.
e. All the above.

 

 

____    4.   A simple-centered 3-point moving average of the time-series variable Xt is given by:

a. (Xt-1 + Xt-2 + Xt-3)/3.
b. (Xt + Xt-1 + Xt-2)/3.
c. (Xt+1 + Xt + Xt-1)/3.
d. None of the above.
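
Note: A centered 3-point moving average places each average at the middle observation, using one value before and one value after it. The sketch below uses hypothetical data and also shows that no average can be formed at either end of the series.

```python
def centered_ma3(x):
    """Centered 3-point moving average: (x[t-1] + x[t] + x[t+1]) / 3."""
    out = [None] * len(x)
    for t in range(1, len(x) - 1):
        out[t] = (x[t - 1] + x[t] + x[t + 1]) / 3
    return out


# Hypothetical series.
series = [3, 6, 9, 12, 15]
smoothed = centered_ma3(series)

print(smoothed)
```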

 

 

____    5.   Which of the following is not a problem with moving-average forecasting?

a. It produces serially correlated forecasts.
b. It removes short-term variability by averaging nearby data.
c. It cannot predict reversals in trends.
d. It cannot model non-stationary data.
e. All the above.

 

 

____    6.   With which type of time-series data should moving-average smoothing methods produce the best forecasts?

a. Seasonal.
b. Stationary.
c. Trending.
d. Cyclical.
e. All the above.

 

 

____    7.   In using moving-average smoothing to generate forecasts, a three-month moving average will be preferred to a six-month moving average

a. if the true data cycle is three months.
b. if it has a lower RMSE.
c. if it has a lower mean-squared error.
d. if we have very little data to work with.
e. All the above.

 

 

____    8.   Moving-average smoothing may lead to misleading inference when applied to

a. stationary data.
b. forecasting trend reversal in the stock market.
c. small and limited data sets.
d. large and plentiful data sets.
e. None of the above.

 

 

____    9.   Which method uses an arithmetic mean to forecast the next period?

a. Naive.
b. Moving averages.
c. Exponential smoothing.
d. Adaptive filtering.
e. None of the above

 

 

____  10.   Some drawbacks to using centered moving-average smoothing models include:

a. loss of data at each end of the original time series.
b. introduction of autocorrelation into the forecasts.
c. inability to forecast turning points in the data.
d. All the above.

 

 

____  11.   Which forecasting model assumes that the pattern exhibited by historical data can best be represented by an arithmetic average of nearby observations?

a. Simple exponential smoothing.
b. Naive methods.
c. Moving average smoothing.
d. Holt’s smoothing.
e. None of the above.

 

 

____  12.   Which method is used to develop a simple model that assumes that weighted averages of recent periods are the best predictors of the future?

a. Naive.
b. Moving averages.
c. Exponential smoothing.
d. Naïve model squared.
e. None of the above

 

 

____  13.   Simple exponential smoothing models are useful for data that have

a. a downward time trend.
b. an upward time trend.
c. neither an upward nor a downward time trend.
d. pronounced seasonality.
e. All the above.

 

 

____  14.   Simple exponential smoothing models differ from moving average models in that

a. moving average models use weighted averages of the data whereas simple exponential smoothing models use simple averages.
b. simple exponential smoothing models use weighted averages of the data whereas moving average models use simple averages.

 

 

____  15.   Which of the following is a factor in the decision to use exponential smoothing rather than moving-average smoothing to forecast a given time series?

a. Amount of data available.
b. Importance of recent past versus distant past.
c. Forecast horizon.
d. Expertise of the forecast manager.
e. None of the above.

 

 

____  16.   The term ‘exponential’ in the exponential smoothing method refers to

a. weights on past data that increase exponentially into the past.
b. weights on past data that decrease exponentially into the past.
c. calculation uses a weighted average.
d. using a non-weighted polynomial on past data.
e. None of the above.
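
Note: The weight structure implied by simple exponential smoothing can be listed directly. With smoothing constant alpha, the observation k periods back receives weight alpha * (1 - alpha)^k. The value alpha = 0.3 below is an arbitrary choice for illustration.

```python
def ses_weights(alpha, n_lags):
    """Weights placed on the observations 0, 1, 2, ... periods back."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]


weights = ses_weights(alpha=0.3, n_lags=50)

print([round(w, 4) for w in weights[:5]])
print("sum of the first 50 weights:", round(sum(weights), 6))
```

The weights shrink geometrically with age and sum to approximately one, so the forecast is a weighted average that favors recent data.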

 

 

____  17.   Which of the following is not correct concerning choosing the appropriate size of the smoothing constant (a or alpha) in the simple exponential smoothing model?

a. Select values close to zero if the series has a great deal of random variation.
b. Select values close to one if you wish the forecast values to depend strongly on recent changes in the actual values.
c. Select a value that minimizes RMSE.
d. Select a value that maximizes mean-squared error.
e. All the above.

 

 

____  18.   The simple exponential smoothing model can be expressed as

a. a simple average of past values of the data.
b. an expression combining the most recent forecast and actual data value.
c. a weighted average, where the weights sum to zero.
d. a weighted average, where the weights sum to the sample size.
e. None of the above.

 

 

____  19.   The same benefits/criticisms apply to moving average and exponential smoothing with the exception of

a. amount of data required.
b. ease of calculation.
c. ability to model trend.
d. ability to forecast cyclical reversals.
e. None of the above.

 

 

____  20.   Choosing the appropriate size of the smoothing constant (a) in the simple exponential smoothing model

a. is equivalent to asking, “How much weight should be given in revising our forecast for next period to this period’s forecast error?”
b. can best be determined by subjective means.
c. is simple if the data are stationary, since a should be zero.
d. is simple if the data are nonstationary, since a should be one.

 

 

____  21.   The smoothing constant in the exponential smoothing model

a. completely determines the weight structure in exponential smoothing.
b. can be interpreted as the revision of this period’s forecast to today’s forecast error.
c. cannot be equal to 0 or 1.
d. must lie between 0 and 1.
e. All the above.

 

 

____  22.   Which of the following is not a major problem with exponential smoothing?

a. It requires a large amount of data and time to generate forecasts.
b. It requires that the forecaster choose, on some basis, the smoothing constant.
c. It produces forecasts that are serially correlated.
d. It employs only past data in making forecasts of the future.
e. All the above.

 

 

____  23.   Which of the following is not considered a smoothing model?

a. Naive.
b. Moving averages.
c. Exponential smoothing.
d. Adaptive-Response-Rate Single Exponential Smoothing.
e. None of the above.

 

 

Simple Smoothing

 

Note:  The next three questions relate to the following data:

 

Time Period   Actual Series   Forecast Series   Forecast Error
1             100             100               0
2             110
3             115

 

 

____  24.   If a smoothing constant of .3 is used, what is the exponentially smoothed forecast for period 4?

a. 106.6.
b. 103.0.
c. 115.0.
d. 112.6.
e. 104.4.

 

 

____  25.   What is the forecast error for period 3?

a. -3.
b. -12.
c. -10.
d. -7.
e. +7.

 

 

____  26.   If a three-month moving-average model is used, what is the forecast for period 4?

a. 104.4.
b. 106.6.
c. 107.1.
d. 108.3.
e. 110.2.
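
Note: The calculations behind the three questions above can be traced with the sketch below. It assumes the update F[t+1] = F[t] + alpha * (A[t] - F[t]), starts from the period 1 forecast of 100 given in the table, and takes the error as actual minus forecast. The sign of the error flips under the opposite convention.

```python
def ses_forecasts(actual, alpha, initial_forecast):
    """Return forecasts for periods 1..n+1 using F[t+1] = F[t] + alpha * (A[t] - F[t])."""
    forecasts = [initial_forecast]
    for a in actual:
        forecasts.append(forecasts[-1] + alpha * (a - forecasts[-1]))
    return forecasts


actual = [100, 110, 115]  # actual series from the table above
forecasts = ses_forecasts(actual, alpha=0.3, initial_forecast=100)

# Error convention assumed here: actual minus forecast.
errors = [a - f for a, f in zip(actual, forecasts)]

# Three-period moving-average forecast for period 4.
ma3_forecast = sum(actual[-3:]) / 3

print("smoothed forecasts, periods 1-4:", [round(f, 1) for f in forecasts])
print("errors, periods 1-3:", [round(e, 1) for e in errors])
print("moving-average forecast, period 4:", round(ma3_forecast, 1))
```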

 

 

____  27.   If the smoothing constant were chosen to be unity, the exponential smoothing model would equal

a. moving average smoothing.
b. Holt’s exponential smoothing.
c. the simple naive model.
d. Winters’ exponential smoothing.
e. moving average smoothing with a one-year lag.

 

 

____  28.   What do moving-average smoothing and exponential smoothing have in common?

a. They both require only a limited amount of data.
b. They both are simple to use.
c. They both are simple to understand.
d. They both have no ability to adjust for trend in the data.
e. All of the above.

 

 

____  29.   The smoothing constant (a) of the simple exponential smoothing model

a. should have a value close to one if the underlying data is relatively erratic.
b. should have a value close to zero if the underlying data is relatively smooth.
c. is closer to zero, the greater the revision in the current forecast given the current forecast error.
d. is closer to one, the greater the revision in the current forecast given the current forecast error.

 

 

____  30.   In the Holt’s two-parameter smoothing model, the trend smoothing parameter Gamma

a. should be close to one when the data has a relatively smooth trend.
b. should be close to zero when the data has a relatively smooth trend.
c. should be close to one when a is close to one.
d. should be close to one when a is one.

 

 

____  31.   Holt’s forecasted values

a. contain no estimate of trend in the underlying series.
b. are superior when the underlying data has pronounced seasonality.
c. for periods into the future lie along a straight line.
d. are simple centered moving averages.
e. None of the above.
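
Note: A minimal sketch of Holt's two-parameter smoothing follows, with alpha smoothing the level and gamma smoothing the trend. The initialization (first observation as the level, first difference as the trend) and the data are assumptions made for illustration. Forecasts m periods ahead are the final level plus m times the final trend.

```python
def holt_forecast(actual, alpha, gamma, horizon):
    """Holt's two-parameter smoothing; returns forecasts 1..horizon periods ahead."""
    level = actual[0]               # assumed initial level
    trend = actual[1] - actual[0]   # assumed initial trend
    for a in actual[1:]:
        prev_level = level
        level = alpha * a + (1 - alpha) * (prev_level + trend)
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
    return [level + m * trend for m in range(1, horizon + 1)]


# Hypothetical trending series.
data = [10, 12, 15, 16, 19]
holt = holt_forecast(data, alpha=0.5, gamma=0.5, horizon=3)
steps = [holt[i + 1] - holt[i] for i in range(len(holt) - 1)]

print("forecasts:", holt)
print("step between successive forecasts:", steps)
```

Because each additional period ahead adds the same trend estimate, the out-of-sample forecasts are equally spaced.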

 

 

____  32.   The Holt’s forecasting model uses:

a. Naive methods.
b. Moving averages.
c. Exponential smoothing.
d. Adaptive filtering.
e. None of the above.

 

 

____  33.   Holt’s smoothing is best applied to data that are

a. nonseasonal.
b. nonstationary.
c. deseasonalized with a trend.
d. nonstationary and nonseasonal.
e. All the above.

 

 

____  34.   Holt’s model accounts for any growth factor present in a time series by

a. use of a linear trend.
b. smoothing the most recent trend by last period’s smoothed trend.
c. adding trend estimates to level forecasts.
d. using simple exponential smoothing to estimate a trend factor that is then combined in a linear fashion with the level forecast.
e. All the above.

 

 

____  35.   Winters’ exponential smoothing

a. is appropriate for data with both trend and seasonal components.
b. models account for seasonality in a multiplicative manner.
c. models have three smoothing parameters.
d. models use only past observations of a time series.
e. All of the above.
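
Note: The sketch below is one way to write Winters' smoothing with multiplicative seasonality. It uses three smoothing constants: alpha for the level, beta for the seasonal indices, and gamma for the trend, matching the parameter names used in the questions. The initialization from the first two seasons and the quarterly data are assumptions made for illustration, and forecasting software may initialize differently.

```python
def winters_forecast(x, period, alpha, beta, gamma, horizon):
    """Multiplicative Winters smoothing; needs at least two full seasons of data."""
    first = x[:period]
    second = x[period:2 * period]
    # Assumed initialization from the first two seasons.
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / period ** 2
    seasonals = [v / level for v in first]

    for t in range(period, len(x)):
        prev_level = level
        s_old = seasonals[t % period]
        # Level: deseasonalized actual blended with the prior level plus trend.
        level = alpha * x[t] / s_old + (1 - alpha) * (prev_level + trend)
        # Trend: latest change in level blended with the prior trend.
        trend = gamma * (level - prev_level) + (1 - gamma) * trend
        # Seasonal index: latest ratio blended with the prior index.
        seasonals[t % period] = beta * x[t] / level + (1 - beta) * s_old

    n = len(x)
    return [
        (level + m * trend) * seasonals[(n - 1 + m) % period]
        for m in range(1, horizon + 1)
    ]


# Hypothetical quarterly sales with a third-quarter peak and mild growth.
quarterly = [80, 100, 120, 100, 88, 110, 132, 110]
winters = winters_forecast(quarterly, period=4, alpha=0.3, beta=0.2, gamma=0.1, horizon=4)

print([round(f, 1) for f in winters])
```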

 

 

____  36.   Which of the following is not an aspect of the Winters’ exponential smoothing model?

a. Holt’s model extended to deseasonalized data.
b. Simple exponential smoothing applied to nonstationary data.
c. Seasonality estimates that are themselves smoothed.
d. Trend estimates that are themselves smoothed.
e. All of the above.

 

 

____  37.   If the time series of interest is highly random, the seasonal smoothing constant (Beta) of the Winters’ model should be set

a. equal to zero.
b. at a small positive value.
c. at a large positive value, but less than unity.
d. at unity.
e. None of the above.

 

 

____  38.   How many parameters must the forecaster (or the software) set using Winters’ exponential smoothing?

a. 0.
b. 1.
c. 2.
d. 3.
e. None of the above.

 

 

____  39.   In the Adaptive-Response-Rate Single Exponential Smoothing model, the smoothing parameter

a. is not a constant.
b. varies from period to period.
c. is determined by the ratio of the absolute value of the smoothed error divided by the absolute smoothed error.
d. is the ratio of two smoothed error measures.
e. All of the above.
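
Note: The adaptive smoothing parameter can be sketched as below. Alpha is recomputed each period as the absolute value of the smoothed error divided by the smoothed absolute error. The value of beta, the starting alpha, the timing of the update, and the data are all assumptions made for illustration, and textbook presentations differ on these details.

```python
def arrses(actual, beta=0.2, initial_alpha=0.2):
    """Adaptive-response-rate single exponential smoothing.

    Returns the next-period forecast and the alpha used in each period.
    """
    forecast = actual[0]        # assumed starting forecast
    smoothed_err = 0.0
    smoothed_abs_err = 0.0
    alpha = initial_alpha
    alphas = []
    for x in actual:
        error = x - forecast
        smoothed_err = beta * error + (1 - beta) * smoothed_err
        smoothed_abs_err = beta * abs(error) + (1 - beta) * smoothed_abs_err
        if smoothed_abs_err > 0:
            alpha = abs(smoothed_err / smoothed_abs_err)
        alphas.append(alpha)
        forecast = alpha * x + (1 - alpha) * forecast
    return forecast, alphas


# Hypothetical series with a shift in its mean after the fourth period.
series = [100, 102, 101, 103, 120, 122, 121, 123]
next_forecast, alphas = arrses(series)

print("alpha by period:", [round(a, 3) for a in alphas])
print("next-period forecast:", round(next_forecast, 2))
```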

 

 

____  40.   The Adaptive-Response-Rate Single Exponential Smoothing model is termed adaptive because

a. it responds to changes in the pattern of data.
b. the smoothing parameter changes each period.
c. it has the ability to model changes in the mean of time series.
d. it can virtually take care of itself in generating forecasts.
e. All of the above.

 

 

____  41.   The Adaptive-Response-Rate Single Exponential Smoothing model can be amended to handle seasonal data by

a. first deseasonalizing, then reseasonalizing the data.
b. deseasonalizing the data.
c. reseasonalizing the data.
d. smoothing the data trend first.
e. None of the above.

 

 

____  42.

 

The simple equation above represents

a. a Logistics function
b. a Croston intermittent function
c. a Probit function
d. a Gompertz function

 

 

____  43.

 

The simple equation above represents

a. a Logistics function
b. a Croston intermittent function
c. a Probit function
d. a Gompertz function

 

 

____  44.   Growth models like those used in ForecastX usually model situations well where a process grows

a. at a more or less constant rate
b. until reaching saturation
c. in a linear fashion
d. at an exponential rate

 

 

____  45.   The growth models used in ForecastX are sometimes called

a. exponential models
b. smoothing models
c. event models
d. diffusion models

 

 

____  46.   The “L” independent variable in the growth models we examined represents

a. the upper limit of the “Y” variable
b. the number of observations in the original data set
c. the growth rate of the dependent variable
d. the lower limit of the dependent variable

 

 

____  47.   When using a growth model under the assumption that constant improvement becomes harder to achieve as growth takes place, the best model to use is

a. an Event model
b. a Logistics Model
c. a Gompertz model
d. a Croston intermittent model
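
Note: Two growth curves with an upper limit L are sketched below in commonly used forms: the logistic curve L / (1 + a * exp(-b * t)) and the Gompertz curve L * exp(-a * exp(-b * t)). These forms, the parameter names a and b, and all parameter values are assumptions made for illustration and may not match the equations printed with the original questions.

```python
import math


def logistic(t, limit, a, b):
    """Logistic growth curve approaching `limit` as t grows."""
    return limit / (1 + a * math.exp(-b * t))


def gompertz(t, limit, a, b):
    """Gompertz growth curve approaching `limit` as t grows."""
    return limit * math.exp(-a * math.exp(-b * t))


# Hypothetical parameter values.
LIMIT, A, B = 1000.0, 10.0, 0.5
times = list(range(0, 41, 5))
logistic_path = [logistic(t, LIMIT, A, B) for t in times]
gompertz_path = [gompertz(t, LIMIT, A, B) for t in times]

for t, lg, gz in zip(times, logistic_path, gompertz_path):
    print(f"t={t:2d}  logistic={lg:8.2f}  gompertz={gz:8.2f}")
```

Both curves rise toward the limit and flatten as they near it, so neither grows at a constant or exponential rate indefinitely.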

 

 

____  48.   “Event Models” as used in ForecastX

a. are a form of exponential smoothing
b. are a type of growth model
c. are a type of simple regression
d. are a type of moving average

 

 

____  49.   “Events” in an Event model could include

a. seasonality, trends, and cyclicality
b. advertising campaigns, sale prices, and couponing
c. audit dates and forecasting deadlines
d. the first sale date, last sale date, and growth rate for an item

 

 

Smoothing 2

 

 

____  50.   Consider the ForecastX printout above. This is the forecast for a manufactured product.

a. This is a Winters’ Exponential Smoothing model.
b. This is a Holt’s Smoothing model.
c. This is an Event model.
d. This is a Simple Smoothing model.

 

 

____  51.   Consider the ForecastX printout above.

a. There is little trend in the data.
b. There is clear seasonality in the data.
c. The event indices show little (but some) promotional effect.
d. All of the above are correct.

 

 

____  52.   Consider the ForecastX printout above. The seasonal index 4 has a value of 1.14. This indicates

a. that sales in period 4 are usually below average.
b. that sales in period 4 are usually above average.
c. that sales in period 4 are usually quite close to the period average.
d. that sales in period 4 have no seasonal effect.

 

 

____  53.   The Gamma factor above is given as 0.00.

a. This indicates that there is little (or no) seasonality.
b. This indicates that there is little (or no) trend.
c. This indicates that the events have little (or no) effect on sales.
d. This indicates that the model has little (or no) explanatory power.

 

 

____  54.   In the ForecastX model presented above

a. All of the events contribute positively to sales.
b. Some of the events contribute negatively to sales.
c. None of the events contribute negatively to sales.
d. None of the events contribute positively to sales.

 

 

____  55.   In event models

a. events are analogous to seasons in a seasonal model.
b. events need not be defined in the forecast period.
c. the researcher is unable to specify the underlying model.
d. “load” and “deload” factors are never used.

 

 

Winters

 

 

____  56.   Consider the Audit Trail statistics for a Winters model above.

a. The Gamma value of 0.03 indicates trend is present.
b. The Gamma value of 0.03 indicates seasonality is present.
c. The Gamma value of 0.03 indicates no trend is present.
d. The Gamma value of 0.03 indicates no seasonality is present.

 

 

____  57.   In the Winters smoothing model above

a. The Beta value of 0.37 indicates a high degree of trend is present.
b. The Beta value of 0.37 indicates a high degree of seasonality is present.
c. The Beta value of 0.37 indicates that the seasonality is positive.
d. The Beta value of 0.37 indicates there is little variation from the average value of the forecasted variable.

 

 

____  58.   In the Winters model shown above index 1 refers to quarter 1 in the data.

a. Thus, quarter 3 is a below average quarter.
b. Thus, quarter 3 is an above average quarter.
c. Thus, quarter 3 is an average quarter.
d. Nothing can be deduced about quarter 3.

 

 

____  59.   In the Winters model above “Decomposition Type”

a. refers to the type of trend used in the model.
b. refers to the type of seasonality used in the model.
c. refers to the calculation method used to estimate the Alpha factor.
d. refers to the calculation method used to estimate the Gamma factor.

 

 

____  60.   The Winters model above

a. could reasonably be used to forecast 4 quarters into the future.
b. should only be used to forecast one quarter into the future.
c. is considered a long range forecasting model.
d. is quite inaccurate and probably should not be used for forecasting.

 

 

Growth

 

 

____  61.   Consider the growth model Audit Trail statistics shown above. The “Maximum” shown here as 1,200.00

a. is a value calculated to be the largest value the model may achieve.
b. is a value set to be the largest value the model may achieve.
c. is a value representing the maximum growth rate possible over the forecast period.
d. is a value representing the square of the maximum growth rate possible over the forecast period.

 

 

____  62.   In the growth model Audit Trail shown above a Gompertz Curve was probably selected because

a. it was harder to achieve constant improvement as the maximum value was approached.
b. it was easier to achieve constant improvement as the maximum value was approached.
c. a “bell shaped” function was expected.
d. the trend was nonlinear.

 

 

____  63.   In the growth model Audit Trail shown above the saturation point is

a. 100 percent.
b. 15.55.
c. 1,200.
d. 66.28

 

 

____  64.   A Logistics Model assumes

a. it is harder to achieve constant improvement as the maximum value is approached.
b. it is easier to achieve constant improvement as the maximum value is approached.
c. a “bell shaped” function is expected.
d. the trend is linear.

 

 

____  65.   In the Bass Model the p coefficient (as used in ForecastX)

a. is the coefficient of imitation.
b. will be little affected by purchasing power.
c. will tend to be lower if the product exhibits significant network effects.
d. will tend to be higher as more disposable income makes it easier to adopt innovations.

 

 

____  66.   The Bass Model

a. is a type of diffusion model.
b. is a form of exponential smoothing.
c. is used for short range forecasting.
d. does not require a limiting value like the logistics model.

 

 

____  67.   When forecasting the adoption of cellular telephones with the Bass Model

a. we should expect little impact from the choice of a market potential.
b. we should expect turning points to be predicted accurately.
c. we should expect relatively high q values because of the nature of the product.
d. we should expect relatively low p values because of the nature of the product.
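
Note: A discrete-time sketch of the Bass model follows. New adopters in each period equal (p + q * N / m) * (m - N), where N is cumulative adopters to date, m is the market potential, p is the coefficient of innovation, and q is the coefficient of imitation. The parameter values are hypothetical and are not estimates for any real product.

```python
def bass_adoptions(p, q, market_potential, periods):
    """Return new adopters per period under a discrete Bass diffusion model."""
    cumulative = 0.0
    new_adopters = []
    for _ in range(periods):
        remaining = market_potential - cumulative
        n_t = (p + q * cumulative / market_potential) * remaining
        new_adopters.append(n_t)
        cumulative += n_t
    return new_adopters


# Hypothetical parameters: imitation effect (q) much larger than innovation effect (p).
adoptions = bass_adoptions(p=0.03, q=0.38, market_potential=1000, periods=20)
peak_period = adoptions.index(max(adoptions)) + 1

print("new adopters by period:", [round(a, 1) for a in adoptions])
print("peak period:", peak_period)
print("cumulative adopters:", round(sum(adoptions), 1))
```

When q is large relative to p, adoptions start slowly, accelerate as earlier adopters influence later ones, and then decline as the remaining market shrinks.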

 

 

medfly

 

 

The first 23 observations in a data set involving the mortality of medflies are shown above. The column titled “living” indicates the number of living flies on each day of the experiment.

 

Consider that you wish to predict the outcome of the experiment only ten days into the experiment. That is, you wish to forecast when the last medfly will expire. You do so with the model shown above.

 

____  68.   What method was used to fit the model to the original ten data points?

a. A Smoothing Model
b. A Moving Average Model
c. A Diffusion Model
d. A Winters Model
e. None of the above are correct

 

 

____  69.   On approximately what date is the living medfly population expected to reach zero?

 

a. February 9th
b. February 15th
c. February 23rd
d. February 28th

 

 

____  70.   The model chosen for this estimation was probably chosen because

a. there may be an offsetting factor such that growth is more difficult to maintain as the endpoint is approached.
b. there is no offsetting factor hindering the attainment of the endpoint.
c. declining values are always estimated using this type of model.
d. growth can never be negative.
e. None of the above are correct.

 

 

____  71.   When specifying the model used above some limits were probably set by the forecaster. These would probably have been

 

a. a minimum value of 1203646 and a maximum value of some “very high number.”
b. a minimum value of 0 and a maximum value of 1203646.
c. a minimum value of 0 and a maximum value of some “very high number.”
d. left to be determined statistically by the forecasting software.

 

 

____  72.   The Model used to estimate the above medfly model was probably

 

a.
b.
c.
d.
e. None of the above are correct.

 

 

 

____  73.   The model above represents the forecast model for a particular UPC of Lysol Disinfectant Spray. The underlying model used here is

 

a. a simple exponential smoothing model.
b. a Holt’s exponential smoothing model.
c. a Winters exponential smoothing model.
d. an event model.
e. None of the above are correct.

 

 

____  74.   For the Lysol model estimated above

a. about 8% of the variation in sales is explained.
b. the R-Square estimated has no realistic meaning.
c. there are two smoothing constants because this is a Holt’s model.
d. little weight should be given to its forecasts because the Mean Error is negative.
e. None of the above are correct.

 

 

____  75.   In the Lysol model estimated above

a. forecasts will be inaccurate because two Alpha factors have been mistakenly estimated.
b. all the events had a positive effect on Lysol sales.
c. event number three has the greatest positive effect on sales.
d. all of the events had a negative effect on sales.
e. None of the above are correct.

 

 

____  76.   In an Event Model the term “load”

a. probably refers to a period of reduced prices.
b. probably refers to a period of increased prices.
c. probably refers to the period immediately following a promotion.
d. probably refers to the value of the alpha factor for the event.
e. None of the above are correct.

 

 

smoothing 3

 

____  77.   In running an exponential smoothing model the following results were obtained:

 

 

The Beta value listed above indicates that the model

 

a. is probably unreliable for forecasting.
b. has a very high level smoothing constant.
c. exhibits a rather high degree of trend.
d. exhibits a rather high degree of seasonality.
e. None of the above are correct.

 

 

____  78.   In the smoothing model listed above (assuming January is the first month in the data set)

a. the greatest seasonal variation appears in May.
b. the greatest seasonal variation appears in December.
c. there appears to be little seasonal variation between months.
d. there is almost even variation between months.
e. None of the above are correct. The data is quarterly.

 

 

____  79.   In the smoothing model above, the Gamma coefficient reported

a. indicates a high degree of seasonality.
b. indicates some trend in the data.
c. indicates an almost stationary data set.
d. is statistically insignificant.
e. None of the above are correct.

 

 

____  80.   For the smoothing model shown above, the product that is modeled is probably most like which of the following products in terms of its yearly sales pattern?

a. New housing sales
b. Jewelry sales
c. Mustard sales
d. Human insulin sales
e. None of these products would be similar to the sales pattern exhibited by the smoothing model above.

 

 

____  81.   Consider the smoothing model results shown in the following graph of actual and predicted sales:

 

The darker line above is the actual data and the lighter line is the fitted data.

 

Which of the following would be a likely set of parameters to see in this exponential smoothing estimate?

 

a. Alpha = 0.37, Beta = 0.22, Gamma = 0.01
b. Alpha = 0.05, Beta = 0.00, Gamma = 0.37
c. Alpha = 0.37,  Gamma = 0.01
d. Alpha = 0.44

 

 

____  82.   Consider the Bass model results shown below:

 

This model predicts percentage of adoptions over time for a particular product. The results show

 

a. the product is only poorly forecast with a Bass model.
b. the product has a relatively high innovation rate characteristic.
c. the product has a relatively high imitation rate characteristic.
d. the product exhibits no standard growth pattern.
e. None of the above are correct.

 

 

____  83.   Which of the following statements about any moving-averages series is correct?

a. A moving-averages series can lie consistently above or below the original data, namely, when they are growing or declining exponentially.
b. Such a series will anticipate or prolong changes in the original data and, thus, show a different timing of turning points.
c. Such a series will be extremely sensitive to unusually large or small values in the time series, as any average is bound to be.
d. All are correct.

 

 

____  84.   Which of the following is the best general definition of exponential smoothing?

a. It is a forecasting procedure that produces self-correcting forecasts by means of a built-in adjustment mechanism that corrects for earlier forecasting errors: The technique produces a weighted average of all past time-series values with weights decreasing exponentially as one goes back in time, and the average so constructed serves as a forecast for the next period.
b. It is a procedure that constructs a series of numbers by successively averaging overlapping groups of two or more consecutive values in a time series and replacing the central value in each group by the group’s average.
c. It is a procedure that produces artificial (and, therefore, misleading) waves in a moving-averages series, even when there are no waves in the original time series.
d. None of the above.
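
As a rough illustration of the weighted-average description in choice a, the sketch below lists the weights that simple exponential smoothing places on past observations. The function name `ses_weights` and the smoothing constant 0.4 are assumptions chosen for illustration; they do not come from any exhibit in this test bank.

```python
def ses_weights(alpha, n_lags):
    """Weight on the observation k periods back: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]


# Illustrative smoothing constant (an assumption, not a value from the text).
weights = ses_weights(alpha=0.4, n_lags=6)
for lag, w in enumerate(weights):
    print(f"lag {lag}: weight {w:.4f}")

# The weights shrink geometrically; over a very long history they sum to 1.
print(f"sum of first 6 weights: {sum(weights):.4f}")
```

The most recent observation gets weight alpha, and each step further back multiplies the weight by (1 - alpha).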

 

 

____  85.   An exponential smoothing technique that adds a trend smoothing constant to the single-parameter exponential smoothing technique is known as

a. two-parameter (or double) exponential smoothing.
b. three-parameter (or triple) exponential smoothing.
c. the easiest way to produce a seasonally adjusted time series.
d. the ratio-to-moving-average method.

 

 

____  86.   The simple moving average technique

 

a. works better for long-range forecasts than short-range forecasts.
b. reacts well to random variations.
c. reacts well to variations that occur for a reason.
d. requires a minimal amount of data.

 

 

____  87.   Which of the following is true concerning the smoothing parameter (α) used in exponential smoothing?

 

 

a. α = 0.4 means the forecast for the next period is based on 40% older data and 60% recent data.
b. If α = 0, the forecast is equivalent to the naive forecast.
c. The higher the value of α, the less the effect of smoothing.
d. The higher the value of α, the more the effect of smoothing.
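
The role of the smoothing parameter can be explored with a minimal sketch of the simple exponential smoothing recursion. The function name `ses_forecasts`, the initialization of the first forecast to the first observation, and the demand series are all assumptions made for illustration.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts: F[t+1] = alpha * X[t] + (1 - alpha) * F[t].

    Assumption: the first forecast is initialized to the first observation.
    Element i of the result is the forecast for the period after series[i].
    """
    forecast = series[0]
    result = []
    for actual in series:
        forecast = alpha * actual + (1 - alpha) * forecast
        result.append(forecast)
    return result


demand = [20, 26, 18, 24, 22]  # hypothetical data
for alpha in (0.1, 0.5, 1.0):
    print(alpha, [round(f, 2) for f in ses_forecasts(demand, alpha)])
```

Comparing the printed rows shows how closely each setting follows the most recent actual value and how much of the period-to-period movement it damps.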

 

 

____  88.   Given demands, D1 = 20, D2 = 16, and D3 = 12, what is F5 using the naive forecasting method?

 

a. F5 = 8
b. F5 = 12
c. F5 = 16
d. Inconclusive from the given data

 

 

Chapter 5

 

Multiple Choice

Identify the choice that best completes the statement or answers the question.

 

____    1.   What happens if the linearity assumption in regression is violated?

a. No particular problem results.
b. Predictions of Y can still be made, but the coefficient of determination is invalid.
c. The formulas in regression analysis do not apply.
d. The slope of the sample line must be adjusted upward before it can be used.
e. None of the above.

 

 

____    2.   Which of the following is not an assumption of multiple regression models?

a. The Y values are normally distributed about the multiple regression hyper-plane.
b. All X values are measured on a continuous scale.
c. A linear relationship exists between each X variable and Y.
d. The Y values are independent of each other.
e. Dispersion of points around the regression hyper-plane is constant everywhere on the plane.

 

 

____    3.   A regression of retail sales on disposable income and two interest rates, the prime rate and the short-term savings rate, is likely to have the problem of

a. seasonality.
b. heteroscedasticity.
c. multicollinearity.
d. serial correlation.
e. None of the above.

 

 

____    4.   Perfect multicollinearity is the

a. presence of a perfect linear association among independent variables in the sample.
b. presence of zero linear association among independent variables in the sample.
c. presence of significant covariation between adjacent residuals.
d. absence of significant covariation between adjacent residuals.
e. None of the above.

 

 

____    5.   Dummy variables

a. are used to measure the presence or absence of a certain attribute.
b. can be used to model the effects of seasonality in the data.
c. take on the value of either zero or one.
d. are indicator random variables.
e. All of the above.

 

 

Personnel Test

 

Note:  The next few questions utilize the following information:

 

The personnel department of a large manufacturing firm selected a random sample of 23 workers.  The workers were interviewed and given several tests.  On the basis of the test results, the following variables were investigated: X2 = manual dexterity score, X3 = mental aptitude score, and X4 = personnel assessment score.

 

Subsequently, the workers were observed in order to determine the average number of units of work completed (Y) in a given time period for each worker.  Regression analysis yielded these results:

Y = -212 + 1.90X2 + 2.00X3 + 0.25X4,          R2 = .75.

(.050)      (.060)        (.20)

 

The quantities in parentheses are the standard errors of the regression coefficients.  The standard error of the regression is 25, and the standard deviation of the dependent variable is 50.

 

____    6.   Which variables are making a significant contribution to the prediction of units of work completed at the .01 significance level (two tailed)?

a. All three variables.
b. Manual dexterity and personnel assessment.
c. Manual dexterity and mental aptitude.
d. Personnel assessment.
e. None of the variables are statistically significant.
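
The significance check can be sketched by dividing each coefficient by its standard error and comparing the ratio with a table value. The coefficients, standard errors, and sample size below are taken from the Personnel Test information; the critical value 2.861 is read from a standard t table (two-tailed, .01 level, 19 degrees of freedom), and the variable names are chosen here for illustration.

```python
coefficients = {
    "manual dexterity": 1.90,
    "mental aptitude": 2.00,
    "personnel assessment": 0.25,
}
standard_errors = {
    "manual dexterity": 0.050,
    "mental aptitude": 0.060,
    "personnel assessment": 0.20,
}

n_obs = 23
n_predictors = 3
degrees_of_freedom = n_obs - (n_predictors + 1)  # one df is also lost to the intercept

# Two-tailed critical value at the .01 level, taken from a standard t table for 19 df.
T_CRITICAL = 2.861

t_ratios = {name: coefficients[name] / standard_errors[name] for name in coefficients}
significant = [name for name, t in t_ratios.items() if abs(t) > T_CRITICAL]

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.2f}")
print("df =", degrees_of_freedom, "| significant:", significant)
```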

 

 

____    7.   Which of the following statements is the correct interpretation of the mental aptitude regression coefficient?

a. If we increase mental aptitude by one unit, holding the other predictor variables constant, units of work completed will increase by an average of 2.0.
b. If we increase units of work completed by one unit, holding the other predictor variables constant, the mental aptitude score will increase by an average of 2.0.
c. If we increase mental aptitude by one unit, holding the other predictor variables constant, units of work completed will increase by an average of .6.
d. If we increase mental aptitude by one unit, holding the other predictor variables constant, units of work completed will decrease by an average of .6.
e. If we increase mental aptitude by one unit, units of work completed will increase by an average of 2.0.

 

 

____    8.   What percent of the variation of units of work completed can be explained by this model?

a. 50
b. 25
c. 90
d. 60
e. 75

 

 

____    9.   What is the correct estimate for the number of units of work completed by a worker with a manual dexterity score of 100, a mental aptitude score of 80 and a personnel assessment score of 10?

a. 140.5
b. 154.3
c. 105.5
d. 138.0
e. 150.0
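
A point estimate of this kind comes from substituting the scores into the fitted equation. The short sketch below does that arithmetic; the function name `predict_units` is hypothetical, while the coefficients and scores come from the Personnel Test information and the question.

```python
def predict_units(dexterity, aptitude, assessment):
    """Evaluate the fitted equation Y = -212 + 1.90*X2 + 2.00*X3 + 0.25*X4."""
    return -212 + 1.90 * dexterity + 2.00 * aptitude + 0.25 * assessment


estimate = predict_units(dexterity=100, aptitude=80, assessment=10)
print(round(estimate, 1))
```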

 

 

____  10.   What is the table t value to test whether a regression coefficient is statistically significant at the .05 level (one tailed) for this problem?

a. 1.729
b. 2.093
c. 1.725
d. 2.086
e. 1.328

 

 

____  11.   Graphically, a multiple regression model with two independent variables looks like a

a. line.
b. plane.
c. hyperplane.
d. rectangle.
e. quadrilateral.

 

 

____  12.   A multiple regression model using 200 data points (with three independent variables) has how many degrees of freedom for testing the statistical significance of individual slope coefficients?

a. 199.
b. 198.
c. 197.
d. 196.

 

 

____  13.   Using the significance levels reported by ForecastX™, at what level can we reject a one-sided null relating to a slope coefficient’s statistical significance such that we are 95% confident?

a. .12.
b. .11.
c. .1.
d. .09.
e. None of the above.

 

 

____  14.   What action may reduce multicollinearity when two independent variables have a common trend?

a. Squaring one of the variables.
b. Subtracting one from the other.
c. First-differencing the data.
d. Dividing one by the other.
e. All of the above.

 

 

____  15.   Which of the following is not recommended in selecting the correct set of independent variables for multiple regression?

a. R-squared.
b. Adjusted R-squared.
c. Akaike Information Criterion.
d. Bayesian Information Criterion.
e. None of the above.

 

 

____  16.   How are the AIC and BIC model selection criteria used in the model selection process for multiple regression?

a. AIC is minimized whereas BIC is maximized.
b. AIC is maximized whereas BIC is minimized.
c. Both AIC and BIC are maximized.
d. Both AIC and BIC are minimized.
e. None of the above.
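
To make the use of the two criteria concrete, the sketch below scores three hypothetical regressions fitted to the same dependent variable. It assumes one common least-squares form of the criteria, AIC = n ln(SSE/n) + 2k and BIC = n ln(SSE/n) + k ln(n), where k is the number of estimated parameters; forecasting packages may report versions that differ by constants. The sums of squared errors and parameter counts are invented for illustration.

```python
import math


def aic(sse, n_obs, n_params):
    """Least-squares AIC (one common form; an assumption about the exact formula)."""
    return n_obs * math.log(sse / n_obs) + 2 * n_params


def bic(sse, n_obs, n_params):
    """Least-squares BIC; the penalty per parameter grows with ln(n)."""
    return n_obs * math.log(sse / n_obs) + n_params * math.log(n_obs)


n_obs = 40
# Hypothetical candidate models (SSE and parameter counts are made up).
candidates = {
    "one predictor": {"sse": 520.0, "n_params": 2},
    "two predictors": {"sse": 410.0, "n_params": 3},
    "five predictors": {"sse": 400.0, "n_params": 6},
}

scores = {
    name: (aic(m["sse"], n_obs, m["n_params"]), bic(m["sse"], n_obs, m["n_params"]))
    for name, m in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name}: AIC = {a:.2f}, BIC = {b:.2f}")
print("lowest AIC:", best_by_aic, "| lowest BIC:", best_by_bic)
```

In this made-up case the five-predictor model has the smallest SSE, yet the penalty terms leave it with higher scores than the two-predictor model.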

 

 

____  17.   Which of the following is not correct about near multicollinearity?

 

a. It arises when we have two or more independent variables which essentially measure the same effect on the dependent variable.
b. It arises when we have two or more independent variables, which are highly correlated.
c. It is often indicated by a large value of the calculated F statistic for the multiple regression model accompanied by relatively small values of the calculated t-statistics for individual parameters.
d. It is often indicated by a small value of the calculated F statistic for the multiple regression model accompanied by relatively large values of the calculated t-statistics for individual parameters.
e. a. and b. are correct.

 

 

Estimated Demand Function

 

The following is an estimated demand function:

 

Q  =  875  + 6 XA  +  15 Y –  5 P

(125)       (2)         (4)     (-1.2)

 

Where Q is quantity sold, XA is advertising expenditure (in thousands of dollars), Y is income (in thousands of dollars), and P is the good’s price.  The standard errors for each estimate are in parentheses.  The equation has been estimated from 10 years of quarterly data.  The R2 was .92; the F-statistic was 57; the Standard Error of the Estimate (SEE) is 25.

 

____  18.   According to the common 95 percent level of significance (estimated) for the regression above,

 

a. all variables are probably significant.
b. only price is significant.
c. no variable is significant.
d. unable to determine from information given.

 

 

____  19.   Suppose the values of the explanatory variables next period are: Advertising  =  $100,000; Income  =  $10,000; and Price  = $100.  Using the above fitted regression, what is the predicted value of sales?

a. 2125
b. 1625
c. 1125
d. 1870
e. unable to determine from the information given.

 

 

____  20.   For the above regression, an estimated 95 percent confidence interval around the sales prediction would be

a. 1125  to  1225
b. 1025  to  1125
c. 1425  to  1850
d. 1760  to  1920
e. 1075  to  1175
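
The prediction and the approximate interval can be worked through in a few lines. The sketch assumes the rule of thumb of adding and subtracting two standard errors of the estimate for an approximate 95 percent interval; the function name `predict_quantity` is hypothetical, and the coefficients, SEE, and input values come from the Estimated Demand Function information.

```python
def predict_quantity(advertising_thousands, income_thousands, price):
    """Evaluate the fitted demand function Q = 875 + 6*XA + 15*Y - 5*P."""
    return 875 + 6 * advertising_thousands + 15 * income_thousands - 5 * price


SEE = 25  # standard error of the estimate reported with the regression

# Advertising and income enter in thousands of dollars: $100,000 -> 100, $10,000 -> 10.
point = predict_quantity(advertising_thousands=100, income_thousands=10, price=100)

# Rule-of-thumb interval (an assumption): point estimate plus or minus 2 * SEE.
lower, upper = point - 2 * SEE, point + 2 * SEE
print(point, (lower, upper))
```

Note the unit conversion: entering advertising as 100000 instead of 100 would inflate the prediction by several orders of magnitude.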

 

 

____  21.   When autocorrelation is present, which of the following is not a problem?

a. Estimated coefficients may be unreliable.
b. The t and F statistics are no longer applicable.
c. The R-squared tends to be too small.
d. The adjusted R-squared tends to be too large.
e. All of the above.

 

 

____  22.   Which of the following “goodness-of-fit” measures should not be used in the context of multiple regression?

a. The F statistic.
b. The Durbin-Watson statistic.
c. The Coefficient of Determination.
d. The AIC and BIC criteria.
e. None of the above.

 

 

____  23.   The F-statistic in the multiple regression model

a. is used to test for the presence of serial correlation.
b. tests for the presence of first-order autocorrelation.
c. tests for the significance of the R-squared statistic.
d. is used to test for data non-linearity.
e. All of the above.

 

 

____  24.   A potential diagnosis and/or cure for the multicollinearity problem does not include

a. dropping all but one of the highly correlated independent variables from the model.
b. valuing variables in nominal, not real, terms.
c. testing for a high degree of correlation among the independent variables.
d. comparing signs and sizes of estimated coefficients with what is expected on the basis of economic theory.
e. All of the above.

 

 

____  25.   Forecasters who base model selection criteria on the maximization of R2 should

a. be wary that extremely high values of R2 may indicate a definitional relationship rather than causal as required by the multiple regression models.
b. be aware that the simple R-squared measure is suspect when autocorrelation is present.
c. be aware that R-squared can be made arbitrarily large by adding additional explanatory variables to the model.
d. instead use the adjusted R-squared measure.
e. All of the above.

 

 

____  26.   Multicollinearity in a regression model occurs when

a. the Durbin-Watson statistic and the R-squared are correlated.
b. there is some correlation among the residuals and the values of the explanatory variables.
c. there is no correlation between the forecast error in one period and the error in the next period.
d. a nonlinear specification is used.
e. None of the above.

 

 

____  27.   Which statement is not correct?

a. R-squared is a measure of the degree of variability in the dependent variable about its sample mean explained by the regression line.
b. The adjusted R-squared measure should be used in the case of more than one independent variable.
c. The null hypothesis that R2 = 0 can be tested using the F-statistic.
d. Forecasters should select independent variables on the basis of R2.
e. All the above.

 

 

____  28.   When autocorrelation is present, which of the following is a problem?

a. Regression coefficient estimates are biased.
b. The t and F distributions are no longer applicable.
c. The D-W statistic is close to minus one.
d. Spurious regression.
e. None of the above.

 

 

____  29.   The F-test in multiple regression

a. is used to test for the presence of autocorrelation.
b. tests for the presence of first-order autocorrelation.
c. tests the significance of the Durbin-Watson statistic.
d. tests a null involving all regression slope coefficients simultaneously.
e. is used to test the significance of individual coefficients.

 

 

____  30.   The F-statistic reported in standard multiple regression computer packages tests which hypothesis?

a. H0:  b1 ≠ b2 ≠ b3 ≠ .. ≠ bK ≠ 0.
b. H0:  b1 + b2 + b3 + .. + bK = 0.
c. H0:  b1 = b2 = b3 = .. = bK = 0.
d. H0:  The set of independent variables has a significant linear influence on the dependent variable.

 

 

____  31.   The Durbin-Watson statistic

a. is used to test the null hypothesis of first-order autocorrelation.
b. has a t distribution with N – (K+1) degrees of freedom.
c. is the squared value of the F-statistic.
d. is used to test the null of no multicollinearity.
e. None of the above.
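
For reference, the Durbin-Watson statistic is the sum of squared changes in successive residuals divided by the sum of squared residuals. The sketch below computes it for two hypothetical residual series; values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The function name and both series are illustrative assumptions.

```python
def durbin_watson(residuals):
    """DW = sum((e[t] - e[t-1])**2 for t >= 2) / sum(e[t]**2)."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals: long runs of the same sign versus sign flips every period.
runs_of_same_sign = [1.0, 1.2, 0.9, 1.1, -1.0, -1.2, -0.9, -1.1]
alternating_sign = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_runs = durbin_watson(runs_of_same_sign)
dw_alternating = durbin_watson(alternating_sign)
print(round(dw_runs, 3), round(dw_alternating, 3))
```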

 

 

____  32.   Which of the following statements are true?

a. Autocorrelation arises when there is a perfect linear association between the dependent and independent variables.
b. Autocorrelation implies the error terms have differing variances.
c. Autocorrelation causes the estimated regression standard error to be biased.
d. Autocorrelation can be tested using the F-statistic.

 

 

____  33.   The inclusion of seasonal dummy variables to a multiple regression model may help eliminate

a. autocorrelation if the data are characterized by seasonal fluctuations.
b. perfect multicollinearity.
c. near multicollinearity.
d. bias in OLS slope estimates caused by autocorrelation.
e. All of the above.

 

 

____  34.   Consider the following group: R-squared, Adjusted R-squared, Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC).  Which one doesn’t belong with the others?

a. R-squared.
b. Adjusted R-squared.
c. Akaike Information Criterion (AIC).
d. Bayesian Information Criterion (BIC).

 

 

____  35.   Model A has an AIC number of  300 whereas model B has an AIC number of 400 (both models have the same dependent variable).  This suggests that which model is more correctly specified?

a. Model A.
b. Model B.
c. Can’t tell without knowing sample size.
d. Not enough information is provided.

 

 

____  36.   Which of the following is not correct?  Seasonality in a time series data set containing quarterly observations can be handled by

a. using four dummy variables, one for each season.
b. using three dummy variables to represent any three of the quarters.
c. using Winters’ smoothing.
d. deseasonalizing the data and then applying nonseasonal methods.
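
The difference between using four and three quarterly dummy variables can be seen by building the design matrix rows directly. This is a small sketch under the assumption that the regression also includes an intercept; the helper name `quarterly_design_rows` and the choice of quarter 4 as the omitted quarter are illustrative.

```python
def quarterly_design_rows(n_periods, n_dummies):
    """Rows of [intercept, D1, ..., Dn] for consecutive quarters starting in Q1."""
    rows = []
    for t in range(n_periods):
        quarter = t % 4  # 0 = Q1, 1 = Q2, 2 = Q3, 3 = Q4
        dummies = [1 if quarter == q else 0 for q in range(n_dummies)]
        rows.append([1] + dummies)
    return rows


four_dummies = quarterly_design_rows(8, n_dummies=4)
three_dummies = quarterly_design_rows(8, n_dummies=3)

# With four dummies plus an intercept, the dummies always add up to the intercept
# column, which is an exact linear dependence among the regressors.
exact_dependence = all(sum(row[1:]) == row[0] for row in four_dummies)

# With three dummies, the omitted quarter (Q4 here) breaks that dependence.
dependence_broken = any(sum(row[1:]) != row[0] for row in three_dummies)

print(exact_dependence, dependence_broken)
```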

 

 

____  37.   A company has computed a seasonal index for its quarterly sales.  Which of the following statements about the index is not correct?

a. The sum of the four quarterly index numbers should be 4.
b. An index of .75 for the first quarter indicates that sales were 25 percent lower than the average quarterly sales.
c. An index of 1.1 for the second quarter indicates that sales were 10 percent above the average quarterly sales.
d. The index for any quarter must be between zero and 2.
e. The average index for each of the four quarters should be 1.

 

 

____  38.   How would you model the effect of rain on attendance at a soccer game?

a. Create a dummy variable to represent rain and a second dummy variable to represent no rain.
b. Introduce a variable measuring the inches of rain for a given day.
c. Create a single rain dummy variable.
d. Omit rain days from the data set.
e. All of the above.

 

 

____  39.   Which of the following is probably not a potential cause of data seasonality?

a. Weather.
b. Cultural Traditions.
c. Religious Traditions.
d. Government Behavior.
e. Income.

 

 

____  40.   Quarterly seasonal dummy variables take on values

a. 1 to 4.
b. 1 to 3.
c. 0 to 3.
d. 0 to 4.
e. None of the above.

 

 

____  41.   Including male and female dummy variables in the same regression to represent sex will likely result in

a. near multicollinearity.
b. perfect multicollinearity.
c. serial correlation.
d. heteroscedasticity.
e. All of the above.

 

 

____  42.   In using quarterly time series data, which quarter can serve as the base period for interpretation of dummy variables?

a. Quarter one.
b. Quarter two.
c. Quarter three.
d. Quarter four.
e. Any of the above.

 

 

____  43.   In a regression of sales on income and seasonal dummy variables for a quarterly time series, a negative sign of the quarter 3 dummy variable means

a. sales for quarter three are negative.
b. sales for quarter three are below average.
c. sales for quarter three are below that of the base quarter.
d. sales for quarter three are above average.
e. None of the above.

 

 

____  44.   Which of the following types of models cannot be satisfactorily estimated using ordinary least squares regression (with or without data transformations)?

a. Y = a + b1(X) + b2(X^2).
b. Y = a + b1(X) + b2(X^3).
c. Y = b(XY).
d. Y = B0X^b.
e. All of the above types can be estimated using OLS.
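
As an illustration of estimation "with data transformations," the sketch below fits a multiplicative model by taking logarithms, so that ln Y = ln B0 + b ln X becomes an ordinary straight-line regression. It reads choice d as Y = B0·X^b, which is an assumption about the lost exponent; the helper `fit_line` and the noise-free data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical noise-free data generated from Y = 3 * X ** 0.5.
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [3.0 * xi ** 0.5 for xi in x]

# Regress ln(Y) on ln(X), then transform the intercept back to recover B0.
log_intercept, b_hat = fit_line([math.log(v) for v in x], [math.log(v) for v in y])
b0_hat = math.exp(log_intercept)
print(round(b0_hat, 4), round(b_hat, 4))
```

The polynomial forms in choices a and b need no transformation of Y at all: the squared or cubed term is simply entered as an additional column of the regression.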

 

 

____  45.   Which of the following statements is not true?

a. Data nonlinearity can be modeled by adding the square of an independent variable to the regression equation.
b. Many economic relationships are nonlinear due to the concept of diminishing marginal returns.
c. A logarithmic transformation is used to estimate exponential relationships among variables.
d. Cubic polynomials cannot be estimated using any form of multiple regression.
e. All of the above.

 

 

____  46.   Which of the following is not useful advice in using multiple regression to generate forecasts?

a. One should always prefer quantitative models to subjective expertise.
b. Keep the model simple.
c. Use the AIC and BIC measures to help in selecting the appropriate set of independent variables.
d. Focus on model accuracy rather than model fit.
e. All of the above.

 

 

____  47.   Large and complicated forecasting models

a. are expensive to maintain.
b. are hard to explain to upper-level management.
c. tend to be distrusted by management.
d. tend to put too much emphasis on quantitative aspects of forecasting.
e. All of the above.

 

 

Domestic Car Sales

 

Consider the following multiple regression model of domestic car sales (DCS) where:

 

DCS = domestic car sales

DCSP = domestic car sales price (in dollars)

PR = prime rate as a percent (i.e., 10% would be entered as 10)

Q2 = quarter 2 dummy variable

Q3 = quarter 3 dummy variable

Q4 = quarter 4 dummy variable

 

 

____  48.   Does the regression pass the “first quick check” (i.e., economic realism)?

 

a. No, because the sign of one of the regression coefficients is incorrect.
b. Yes, because the signs of all the regression coefficients are correct.
c. No, because the price variable does not make economic sense to include in the regression.
d. Yes, because the SEE passes its statistical test.

 

 

____  49.   For the domestic car sales regression, which variable coefficients pass the “second quick check” (i.e., statistical significance)?

a. All of the coefficients pass.
b. None of the coefficients pass.
c. Those that pass are DCSP, Q2, and Q3.
d. Those that pass are DCSP, PR, and Q4.
e. Those that pass are Q2, Q3, and Q4.

 

 

____  50.   For the domestic car sales regression above, what does the “third quick check” (i.e., accuracy) show?

a. It shows that more than three-quarters of the variation in DCS is explained.
b. It shows that almost no serial correlation exists.
c. It shows that a great deal of seasonality exists in the data.
d. It shows that a small trend exists in the data.

 

 

____  51.   In the domestic car sales regression above, what evidence do you have of any pattern in the error terms?

a. The SEE indicates a high probability of a pattern in the error terms.
b. The AIC and BIC both indicate a pattern in the error terms.
c. There are no error terms in this regression and so there can be no pattern.
d. The Durbin-Watson statistic indicates little pattern in the error terms.

 

 

____  52.   For the domestic car sales regression above, assume that:

 

DCSP = $10,000

PR = 10 percent

and that it is the first quarter of the year.

 

What will DCS be predicted to be by the regression model?

a. 6,545.45
b. 1,858.62
c. 3,466.16
d. 2,071.99

 

 

____  53.   For the domestic car sales regression above, assume that:

 

DCSP = $10,000

PR = 10 percent

and that it is the first quarter of the year.

 

What will be the approximate 95% confidence interval for the DCS prediction?

a. 2313 to 1831
b. 1649 to 2039
c. 2964 to 4126
d. 4620 to 7156

 

 

____  54.   In the domestic car sales function there is evidence of seasonality. How does the regression model show this evidence?

a. With the Durbin-Watson statistic.
b. With the t-statistics on the dummy variables.
c. With the SEE.
d. With the F-statistic.
e. With the R-Square.

 

 

____  55.   For the domestic car sales regression the coefficient of determination shows that

a. 120.60 is the error associated with the independent variable.
b. 288.10 will be the error associated with DCS.
c. 3,266.66 will be the most likely value of DCS.
d. 75.64% of the variation in DCS is explained by variation in the independent variables.

 

 

____  56.   The domestic car sales model

a. could be used to forecast DCS at some future date.
b. was estimated using a least squares model.
c. is a linear model.
d. All of the above are true.
e. None of the above are true.

 

 

____  57.   The AIC can be of help in model selection when choosing among

a. coefficients in a multiple regression.
b. appropriate lag structures.
c. different orders of a polynomial regression.
d. All of the above are correct.

 

 

____  58.   The Principle of Parsimony is given in the following statement:

a. overfitting a model is always preferable to underfitting a model.
b. the more independent variables used, the higher the R2.
c. use the BIC as an aid in selecting the appropriate number of observations to use.
d. use the smallest number of parameters necessary to represent the data adequately.

 

 

____  59.   The Akaike rule of thumb is

a. if the AIC is between 0 and 2 from the “best” model, there is substantial support for both models.
b. if the AIC is between 4 and 7 from the “best” model there is substantial support for both models.
c. if the AIC is negative neither model can be optimal.
d. if the AIC is less than 10 there is substantial support for neither model.

 

 

____  60.   Use the Akaike criterion

a. in observational studies when there are large numbers of variables.
b. in exploratory studies when you have no a priori hypotheses.
c. in experimental studies when you are testing few effects.
d. to select the correct degrees of freedom to use in evaluating summary statistics.

 

 

ForecastX Regressions

 

Exhibit #1

 

 

Exhibit #2

 

 

Consider the two regressions presented above in answering the following questions.

 

____  61.   In the simple regression above

 

a. the first quick check fails.
b. the second quick check fails.
c. there does not appear to be first order serial correlation.
d. the independent variable is Sales.

 

 

____  62.   Consider the two regressions shown above.

a. The simple regression is probably overfit.
b. The simple regression is probably underfit.
c. The multiple regression has only one significant independent variable.
d. The multiple regression probably suffers from rampant multicollinearity.

 

 

____  63.   Consider the two regressions shown above. For the multiple regression above the Akaike Information Criterion indicates

a. that the multiple regression is less optimal than the simple regression.
b. that approximately 86% of the variation in sales is explained.
c. that the addition of an income variable resulted in a more optimal model.
d. that the researcher should give substantial consideration to both the simple and multiple regression.

 

 

____  64.   Consider the two regressions shown above.

a. The simple regression probably suffers from specification error.
b. The multiple regression probably suffers from specification error.
c. The simple regression probably suffers from multicollinearity.
d. The multiple regression probably suffers from autocorrelation.

 

 

Bottled Water

 

Shown above is the demand for bottled water in thousands of gallons for 110 consecutive weeks. From weeks 75 through 84 there was a severe flood in the area. Shown below are two regression results using this data.

 

Regression #1

 

 

Regression #2

 

 

____  65.   Consider the two regressions shown above. Which of the following statements is true?

 

a. All independent variables are significant at the 99% level in both regressions.
b. The coefficient on the “Week” index has an incorrect sign in both regressions.
c. Neither regression seems to suffer from serial correlation.
d. Both regressions explain more than 90% of the variation in “Demand.”
e. None of the above statements are true.

 

 

____  66.   Examine the Akaike Information Criterion for both Regression #1 and Regression #2 above.

a. Both AIC measures are statistically significant at the 95% level.
b. Neither AIC measure is significant at the 95% level.
c. Only the Regression #2 AIC is significant at the 95% level.
d. Both AIC measures are significant at the 99% level.
e. None of the above statements are true.

 

 

____  67.   Consider the two regressions immediately above. The “Intervention” variable in Regression #2 represents the flood period by taking on a value of “1” when there is a flood during that week and a value of zero otherwise. How would you interpret the coefficient of the “Intervention” variable in Regression #2?

a. For each week in which a flood occurred, 24.70 more bottled water is demanded than in the first week in the time series.
b. For each week in which a flood occurred, 24.70 more bottled water is demanded than in the week immediately preceding the beginning of the flood.
c. For each week in which a flood occurred, 24.70 more bottled water is demanded than in the average week in the time series.
d. For each week in which a flood occurred, 24.70 more bottled water is demanded than in the average nonflood week in the time series.

 

 

____  68.   Consider Regression #2 immediately above. You should use the rule of thumb taught in class to answer this question. In order to create the approximate 95% confidence interval for an estimate of demand

a. 19.27 must be added to and subtracted from the point estimate.
b. two times 19.27 must be added to and subtracted from the point estimate.
c. 3.70 must be added to and subtracted from the point estimate.
d. two times 3.70 must be added to and subtracted from the point estimate.
e. None of the above are correct statements of how to construct the approximate 95% confidence interval.

 

 

____  69.   Consider the two regression models immediately above. When comparing these two regressions with respect to accuracy

a. it is correct to choose the model that minimizes RMSE but maximizes MAPE.
b. it is incorrect to use either RMSE or MAPE; only MAPE can be used across different regression models.
c. it is correct to choose the model that minimizes both RMSE and MAPE.
d. only the F Statistic should be compared across two different regressions.
e. None of the above statements concerning accuracy are correct.

 

 

____  70.   Consider the two regressions immediately above. In using the Akaike Information Criterion and the Bayesian Information Criterion “closeness” counts (as in the game of horseshoes). Using the rule of thumb we learned in class regarding the interpretation of the information criteria we could correctly say

a. both models have “substantial support” because the AIC and BIC are so close in value to one another.
b. only Regression #1 has “substantial support” because of the differences in the values of the AIC and BIC.
c. only Regression #2 has “substantial support” because of the differences in the values of the AIC and BIC.
d. neither regression has “substantial support” since both regressions have AIC and BIC values substantially above 100.

 

 

____  71.   The Akaike Information Criterion (AIC) may be used

 

a. to determine the correct set of independent variables in a regression.
b. to determine the correct “form” of a regression.
c. to determine the best lag structure in a regression.
d. All of the above are correct.
e. None of the above are correct.

 

 

Television Ad Yields

 

Television ad yields are sometimes measured in millions of retained impressions. The following two regressions model the effectiveness of ads for 21 consumer products. The data are from The Wall Street Journal, March 1, 1984.

 

The variables collected for each of the 21 products are:

SPENDING: TV advertising budget ($ millions)
MILIMP: Millions of retained impressions
MILIMP Sqrd: Millions of retained impressions, squared

 

 

A scatterplot of the data used appears below:

 

Regression #1

 

Regression #2

 

 

____  72.   Regression #1 above

 

a. has a better Akaike score than Regression #2.
b. has a better Coefficient of Variation score than Regression #2.
c. suffers from a serious multicollinearity problem.
d. suffers from a serious autocorrelation problem.
e. None of the above are true.

 

 

____  73.   Regression #2 above for TV Ad Yields

 

a. may suffer from autocorrelation.
b. is superior to Regression #1 in terms of Akaike score.
c. is inferior to Regression #1 in terms of the Adjusted Coefficient of Determination score.
d. has P-values that are lower than acceptable.
e. None of the above are true.

 

 

____  74.   For the TV Ad Yield regressions above

 

a. the standard error of the estimate is better for Regression #1.
b. the standard error of the estimate is better for Regression #2.
c. a forecast confidence interval will be wider for Regression #2.
d. only Regression #1 has acceptable t-statistics.
e. None of the above are true.

 

 

Education

 

A question of interest to many educators and college admissions officers is whether and to what extent high school students’ performance on standardized tests can forecast their performance in college. That is, does how well a student does on a test before entering college bear any relationship to his/her performance in college?

Jeffrey Wooldridge used data collected by Christopher Lemmon at Michigan State University to examine this question. The data contain information about students’ final GPA for all years of college, their performance on the ACT (a standardized test commonly used for college admissions), and their high school GPA (labeled hsGPA below). They estimated models to examine the links between GPA in college and these two separate pre-college measures.

Estimating their regression equation in a statistical software package yielded the following results:

Note: “_cons” is the constant term in the regression.

The dependent variable is “college GPA” shown as colGPA above.

 

____  75.   According to this regression, the most predictive variable for forecasting college GPA was

a. high school GPA.
b. the ACT score.
c. Neither variable was predictive.
d. Both variables were equally predictive.
e. None of the above are correct.

 

 

____  76.   In the Education regression ACT is best described as a(n)

a. independent variable.
b. dependent variable.
c. constant coefficient.
d. variable coefficient.

 

 

____  77.   In the Education regression college GPA is best described as a(n)

a. independent variable.
b. dependent variable.
c. constant coefficient.
d. variable coefficient.

 

 

____  78.   The internal auditor of a bank has developed a multiple regression model which has been used for a number of years to forecast the amount of interest income from commercial loans. During the current year, the auditor applies the model and discovers that the adjusted R2 value has decreased dramatically, but otherwise the model seems to be working okay. Which of the following conclusions are justified by the change?

a. Changing to a cross-sectional regression analysis should cause the adjusted R2 to increase.
b. Regression analysis is no longer an appropriate technique to estimate interest income.
c. Some new factors, not included in the model, are causing interest income to change.
d. A linear regression analysis would increase the model’s reliability.

 

 

Lackland

 

Lackland Ski Resort uses multiple regression to forecast ski lift revenues for the next week based on the forecasted number of days with temperatures above 10 degrees and predicted number of inches of snow. The following function has been developed:

 

Sales = 10,902 + 255 (number of days predicted above 10 degrees) + 300 (number of inches of snow predicted)

 

Other information generated from the analysis includes:

 

Adjusted R2  = .6789

Standard Error of the Estimate (SEE) = 1,879

F-statistic = 6.279 with a significance of .049

 

____  79.   Which variable(s) in this function is (are) the dependent variable(s)?

a. Predicted number of days above 10 degrees.
b. Predicted number of inches of snow.
c. Revenue.
d. Predicted number of days above 10 degrees and predicted number of inches of snow.

 

 

____  80.   Assume that the management predicts the number of days above 10 degrees for the next week to be 6 and the number of inches of snow to be 12. Calculate the predicted amount of revenue for the next week.

a. $10,902
b. $11,362
c. $16,032
d. $20,547
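
The arithmetic behind this question can be checked by plugging the two predicted values into the Lackland function given above. The sketch below is only an illustration; the function name predict_revenue and its argument names are assumptions introduced for the example, while the coefficients are the ones stated in the text.

```python
def predict_revenue(days_above_10, inches_of_snow):
    """Evaluate the Lackland regression function stated in the text."""
    intercept = 10_902   # constant term
    coef_days = 255      # per predicted day above 10 degrees
    coef_snow = 300      # per predicted inch of snow
    return intercept + coef_days * days_above_10 + coef_snow * inches_of_snow

# Management's predictions from the question: 6 days, 12 inches.
revenue = predict_revenue(6, 12)
print(revenue)  # 10,902 + 1,530 + 3,600 = 16,032
```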

 

 

____  81.   Which of the following represents an accurate interpretation of the results of Lackland’s regression analysis?

a. 6.729% of the variation in revenue is explained by the predicted number of days above 10 degrees and the number of inches of snow.
b. The relationships are not significant.
c. The predicted number of days above 10 degrees is a more significant variable than the number of inches of snow.
d. 67.89% of the variation in revenue is explained by the predicted number of days above 10 degrees and the number of inches of snow.

 

 

____  82.   Assume that Lackland’s model predicts revenue for a week to be $13,400. Calculate the 95% confidence interval for the amount of revenue for the week. (The 95% confidence interval corresponds to the area representing 2.3436 deviations from the mean.)

a. $13,400 ± 6,279
b. $13,400 ± 4,404
c. $13,400 ± 6,786
d. $13,400 ± 8,564
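
The interval in this question is the point estimate plus or minus a multiplier times the standard error of the estimate. A minimal sketch of that arithmetic follows, using the multiplier (2.3436) and the SEE (1,879) given in the text; the function name interval_half_width is an assumption made for the example.

```python
def interval_half_width(multiplier, see):
    """Half-width of the approximate interval: multiplier times the SEE."""
    return multiplier * see

point_estimate = 13_400
half = interval_half_width(2.3436, 1_879)
lower, upper = point_estimate - half, point_estimate + half

print(round(half))                 # 4404
print(round(lower), round(upper))  # 8996 17804
```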

Chapter 7

 

Multiple Choice

Identify the choice that best completes the statement or answers the question.

 

____    1.   Why are Box-Jenkins models often referred to as “black boxes?”

a. They ignore causal variables.
b. They use regression analysis in non-standard ways.
c. They evaluate forecast accuracy differently from regression models.
d. They are difficult to understand.
e. All of the above.

 

 

____    2.   Which of the following is not a potential advantage to using ARIMA models to generate forecasts?

a. They are useful when a set of explanatory variables cannot be identified.
b. They are useful when the only data available are the variable to be forecast.
c. They determine a great deal of information about a time series.
d. They are especially useful for long-term forecasts.
e. All of the above are potential advantages.

 

 

____    3.   What is a key difference between ARIMA-type models and multiple regression models?

a. The dependent variable.
b. Attention to data trend and seasonality.
c. Attention to serial correlation.
d. Use of data of the explanatory variables.
e. None of the above.

 

 

____    4.   In the model selection process for ARIMA-type models, the ultimate goal is to find an underlying model that

a. explains the dependent variable.
b. leads to non-random errors.
c. produces white noise forecast errors.
d. models the nonlinear components in a time series.
e. None of the above.

 

 

____    5.   If it is found that the forecast errors from an ARIMA-type model exhibit serial correlation, the model

a. is not an adequate forecasting model.
b. is a candidate for adding another explanatory variable.
c. almost surely contains seasonality.
d. is a candidate for Cochrane-Orcutt regression.
e. All of the above.

 

 

____    6.   “Black box” in the ARIMA model methodology does not refer to

a. autoregressive models.
b. moving average models.
c. causal models.
d. mixed autoregressive-moving-average models.
e. All of the above.

 

 

____    7.   “White noise” refers to model forecast errors that are

a. normally distributed.
b. non-normal.
c. serially independent.
d. heteroscedastic.
e. None of the above.

 

 

____    8.   The ARIMA model selection process seeks to find that underlying model which removes

a. all deterministic components from the data.
b. data trend.
c. data seasonality.
d. any serial correlation in the data.
e. All of the above.

 

 

____    9.   Which of the following models is not considered a potentially correct “black box” in Box-Jenkins modeling?

a. MA(1) models.
b. Exponential smoothing models.
c. Time-trend regression models.
d. Autoregressive models.
e. None of the above are considered a potentially correct “black box.”

 

 

____  10.   Which of the following is not in the moving-average class of Box-Jenkins models?

a. Y_t = e_t + W_1 e_{t-1} + W_2 e_{t-2}.
b. Y_t = e_t + W_1 e_{t-1}.
c. Y_t = X_t + e_t + e_{t-1}.
d. Y_t = e_t + 0.7 e_{t-1}.

 

 

____  11.   Moving-average models are best described as

a. simple averages.
b. non-weighted averages.
c. weighted averages of white noise series.
d. weighted averages of non-normal random variates.
e. None of the above.

 

 

____  12.   Autocorrelation and partial autocorrelation functions differ in

a. what series is being analyzed.
b. their length.
c. diagnostic ability to identify ARIMA models.
d. what is being held constant in the observed correlogram.
e. All of the above.

 

 

____  13.   For a moving-average solution to a forecasting problem, the autocorrelation plot should _____ and the partial autocorrelation plot should _____.

a. slowly approach zero; slowly approach zero.
b. dramatically approach zero; exponentially approach one.
c. slowly approach one; and cyclically approach zero.
d. dramatically cut off to zero; decline to zero either monotonically or in a wavelike manner.
e. None of the above.

 

 

____  14.   The autocorrelation function correlogram should show spikes close to ____ lags if a moving-average type model generates the true data.

a. One.
b. Two.
c. Three.
d. Four.
e. All of the above.

 

 

____  15.   Which of the following patterns of the partial autocorrelation function correlogram is inconsistent with an underlying moving-average data process?

a. Exponentially declining to zero.
b. Cyclically declining to zero.
c. Positive at first, then negative and increasing to zero.
d. Negative at first, then positive and declining to zero.
e. None of the above.

 

 

____  16.   The autocorrelation function of a time series shows coefficients significantly different from zero at lags 1 through 4.  The partial autocorrelation function shows one spike and monotonically increases to zero as lag length increases.  Such a series can be modeled as a _____ model.

a. MA(1).
b. MA(2).
c. MA(3).
d. MA(4).
e. None of the above.

 

 

____  17.   A time series that can be best represented as a MA(2) model has a partial autocorrelation function that

a. exponentially declines to zero as lag length increases.
b. cyclically declines to zero as lag length increases.
c. has one large negative spike and then goes to zero.
d. has one large positive spike and then goes to zero.
e. All of the above.

 

 

____  18.   The order of a moving-average (MA) process can best be determined by the

a. Durbin-Watson statistic.
b. Box-Pierce chi-square statistic.
c. autocorrelation function.
d. partial autocorrelation function.
e. All of the above.

 

 

____  19.   Which of the following is not in the autoregressive class of Box-Jenkins models?

a. Y_t = A_1 Y_{t-1} + A_2 Y_{t-2} + e_t.
b. Y_t = e_t + W_1 e_{t-1}.
c. Y_t - Y_{t-1} = e_t.
d. Y_t = 0.1 Y_{t-1} + e_t.
e. All of the above.

 

 

____  20.   Autoregressive models are best described as

a. simple averages of lagged values of the series.
b. weighted averages of lagged series values plus white noise.
c. weighted average of white noise series.
d. weighted averages of normal random variates.
e. None of the above.

 

 

____  21.   An autocorrelation and partial autocorrelation function for an AR-type process differs from that of a MA-type process in

a. what series is being analyzed.
b. their length.
c. diagnostic ability to assess a moving-average model.
d. that they are opposites.
e. All of the above.

 

 

____  22.   For an autoregressive model solution to a forecasting problem, the autocorrelation plot should _____ and the partial autocorrelation plot should _____.

a. gradually approach zero, dramatically cut off to zero.
b. dramatically approach zero, exponentially approach one.
c. slowly approach one, and cyclically approach zero.
d. dramatically cut off to zero, decline to zero either monotonically or in a wavelike manner.

e. None of the above.
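
The contrast these questions draw between moving-average and autoregressive correlograms can be seen in a small simulation. The sketch below is illustrative only: the coefficient 0.7, the sample size, the random seed, and the helper name sample_acf are assumptions chosen for the example and do not come from the text.

```python
import random

def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

random.seed(0)  # fixed seed so the run is repeatable
n = 5000
e = [random.gauss(0, 1) for _ in range(n + 1)]  # white noise shocks

# MA(1): Y_t = e_t + 0.7 e_{t-1}
ma1 = [e[t] + 0.7 * e[t - 1] for t in range(1, n + 1)]

# AR(1): Y_t = 0.7 Y_{t-1} + e_t, started at zero
ar1 = []
previous = 0.0
for t in range(1, n + 1):
    previous = 0.7 * previous + e[t]
    ar1.append(previous)

acf_ma = sample_acf(ma1, 4)
acf_ar = sample_acf(ar1, 4)
print("MA(1) ACF, lags 1-4:", [round(r, 2) for r in acf_ma])
print("AR(1) ACF, lags 1-4:", [round(r, 2) for r in acf_ar])
```

In theory the MA(1) autocorrelation is 0.7 / (1 + 0.7^2), about 0.47, at lag 1 and zero at longer lags, while the AR(1) autocorrelations decline geometrically as 0.7^k. The simulated values should sit close to those patterns, with some sampling noise.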

 

 

____  23.   The autocorrelation function correlogram should show significant correlation (spikes) at lags of _____ if an autoregressive-type model generates the true data.

a. One.
b. Two.
c. Three.
d. Four.
e. None of the above.

 

 

____  24.   Which of the following patterns of the partial autocorrelation function correlogram is inconsistent with an underlying autoregressive data process?

a. Exponentially declining to zero.
b. Cyclically declining to zero.
c. Positive at first, then negative and increasing to zero.
d. Negative at first, then positive and declining to zero.
e. All of the above.

 

 

____  25.   The partial autocorrelation function shows one spike at lag length one.  Such a series can be modeled as a _____ model.

a. AR(1).
b. AR(2).
c. AR(3).
d. AR(4).
e. None of the above.

 

 

____  26.   A time series that can be best represented as an AR(2) model has a partial autocorrelation function that

a. exponentially declines to zero as lag length increases.
b. cyclically declines to zero as lag length increases.
c. has one large negative spike and then goes to zero.
d. has one large positive spike and then goes to zero.
e. None of the above.

 

 

____  27.   The order “p” of an autoregressive (AR) process can best be determined by the

a. Durbin-Watson statistic.
b. Box-Pierce chi-square statistic.
c. autocorrelation function.
d. partial autocorrelation function.
e. All of the above.

 

 

____  28.   Which of the following is not an ARMA(p, q) model?

a. Y_t = e_t + W_1 Y_{t-1} + W_2 Y_{t-2}.
b. Y_t = e_t + W_1 Y_{t-1}.
c. Y_t = Y_{t-1} + e_t + e_{t-1}.
d. Y_t = Y_{t-1} + 0.7 e_{t-1}.
e. None of the above.

 

 

____  29.   Mixed autoregressive-moving-average models of order (1, 1) have spikes exhibited in

a. the autocorrelation function.
b. the partial autocorrelation function.
c. both autocorrelation and partial-autocorrelation functions.
d. neither the autocorrelation nor the partial-autocorrelation function.
e. None of the above.

 

 

____  30.   ARMA(p, q) models have autocorrelation and partial autocorrelation functions that

a. may both show spikes.
b. may both show monotonically declining estimates.
c. may look amazingly similar.
d. may look quite dissimilar in the nature of adjustment.
e. All of the above.

 

 

____  31.   For an ARMA(1,2) solution to a forecasting problem, the autocorrelation plot should have _____ spike(s) and the partial autocorrelation plot should have _____ spike(s).

a. 1,2.
b. 2,1.
c. 1,1.
d. 2,2.
e. None of the above.

 

 

____  32.   The autocorrelation function correlogram should show spikes close to ____ lags if an ARMA (2, 3) -type model generates the true data.

a. One.
b. Two.
c. Three.
d. Four.
e. None of the above.

 

 

____  33.   The partial-autocorrelation function correlogram should show spikes close to ____ lags if an ARMA (2, 3) -type model generates the true data.

a. One.
b. Two.
c. Three.
d. Four.
e. None of the above.

 

 

____  34.   Which of the following patterns of the partial autocorrelation function correlogram is inconsistent with an underlying ARMA data process?

a. Exponentially declining to zero.
b. Cyclically declining to zero.
c. Positive at first, then negative and increasing to zero.
d. Negative at first, then positive and declining to zero.
e. None of the above.

 

 

____  35.   The autocorrelation function of a time series shows coefficients significantly different from zero at lags 1 through 4.  The partial autocorrelation function shows one spike and monotonically increases to zero as lag length increases.  Such a series can be modeled as a _____ model.

a. ARMA(1, 4).
b. ARMA(2, 4).
c. MA(3).
d. ARMA(4, 1).
e. None of the above.

 

 

____  36.   A time series that can be best represented as an ARMA(2, 0) model has a partial autocorrelation function that

a. has no significant lags.
b. slowly declines to zero as lag length increases.
c. has one large negative spike and then goes to zero.
d. has one large positive spike and then goes to zero.
e. None of the above.

 

 

____  37.   The order of an ARMA(p, q) process can best be determined by the

a. number of AR and MA terms that are significant.
b. Box-Pierce chi-square statistic.
c. autocorrelation function alone.
d. partial autocorrelation function alone.
e. None of the above.

 

 

____  38.   Which of the following are incorrect?

a. Spikes in the partial-autocorrelation function indicate moving-average terms.
b. Spikes in the autocorrelation function indicate autoregressive terms.
c. Most economic data can be modeled as a higher-order ARMA(p, q) model.
d. For an ARMA(p, q) model, both the autocorrelation and partial-autocorrelation functions show abrupt stops.
e. All of the above.

 

 

____  39.   Which of the following is a stationary time series?

a. A series in which consecutive values depend only on the interval of time between them.
b. A series whose mean is constant over time.
c. A series with no trend.
d. A series whose autocorrelation function shows no significant spikes.
e. All of the above.

 

 

____  40.   Which of the following is not a way to induce stationarity out of non-stationary data?

a. First-difference the original series.
b. Second-difference the original series.
c. Transform the original series using logarithms.
d. Examine the data in percentage terms.
e. None of the above.

 

 

____  41.   ARMA models applied to nonstationary data are called

a. ARIMA(p, q) models.
b. ARMA(p, d, q) models.
c. ARIMA(p, d, q) models.
d. MA(p, q) models.
e. MA(d, q) models.

 

 

____  42.   Integration refers to the

a. moving-average order of a time series.
b. autoregressive order of a time series.
c. number of differences required to induce data stationarity.
d. fit of an ARIMA model.
e. None of the above.
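
The idea of integration order as a count of differences can be illustrated with a few lines of code. This is a minimal sketch using deterministic toy series invented for the example (the helper name difference is also an assumption); real economic series are noisy, so the differenced values would fluctuate around a constant rather than equal it exactly.

```python
def difference(series, d=1):
    """Apply first-differencing d times: Y_t - Y_{t-1}."""
    for _ in range(d):
        series = [series[t] - series[t - 1] for t in range(1, len(series))]
    return series

linear = [3 + 2 * t for t in range(6)]   # 3, 5, 7, 9, 11, 13
quadratic = [t * t for t in range(6)]    # 0, 1, 4, 9, 16, 25

print(difference(linear, 1))      # [2, 2, 2, 2, 2]   one difference removes a linear trend
print(difference(quadratic, 1))   # [1, 3, 5, 7, 9]   still trending after one difference
print(difference(quadratic, 2))   # [2, 2, 2, 2]      a second difference removes it
```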

 

 

____  43.   Most economic time series are integrated of what order?

a. Zero.
b. One.
c. Two.
d. Four.
e. None of the above.

 

 

____  44.   What transformation will transform any trend in variance to a trend in the mean of a time series?

a. First-differencing the data.
b. Squaring the data.
c. Taking natural logarithms of the data.
d. Second-differencing the data.
e. All of the above.

 

 

____  45.   Which of the following models utilizes a transformed series to induce a stationary series?

a. ARIMA(1, 0, 1).
b. ARIMA(1, 0, 0).
c. ARIMA(1, 1, 1).
d. ARIMA(0, 0, 1).
e. None of the above.

 

 

____  46.   Which of the following is not a way to generate stationary data out of non-stationary data?

a. First-difference the original series.
b. Second-difference the original series.
c. Transform the original series using logarithms.
d. Examine a non-linear form of the model.
e. All of the above.

 

 

____  47.   Which of the following best describes the autocorrelation function (ACF) of a nonstationary time series?

a. The ACF has several significant spikes.
b. The ACF has coefficients that very gradually go to zero.
c. The ACF has a spurious pattern of spikes as lags increase.
d. The null of zero autocorrelation is rejected for a significant number of lags.
e. All of the above.

 

 

____  48.   Which of the following is not a characteristic of a time series best represented as an ARIMA(3,0,1) model?

a. The original series is stationary.
b. The autocorrelation function has one dominant spike.
c. The partial autocorrelation function has one dominant spike.
d. The partial autocorrelation function has three spikes.
e. None of the above.

 

 

____  49.   Which of the following is not a first step in the ARIMA model selection process?

a. Examine the autocorrelation function of the raw series.
b. Examine the partial autocorrelation function of the raw series.
c. Test the data for stationarity.
d. Estimate an ARIMA(1,1,1) model for reference purposes.
e. All of the above.

 

 

____  50.   Which of the following rules is not a useful first step in the ARIMA model selection process?

a. If the autocorrelation function stops after q spikes, the appropriate model is a MA(q) type.
b. If the partial autocorrelation function stops after p spikes, then the appropriate model is an AR(p) type.
c. If the autocorrelation function does not rapidly approach zero, then first-difference the data.
d. If the partial autocorrelation function quickly approaches zero, then data first differencing may be recommended.
e. All of the above.

 

 

____  51.   The third step of the ARIMA model selection process is to diagnose whether the correct model has been chosen.  Which of the following is not used in this diagnostic process?

a. The autocorrelation function of the forecast errors.
b. The partial autocorrelation function of the forecast errors.
c. The Ljung-Box Z statistic.
d. The chi-square distribution.
e. All of the above.

 

 

____  52.   The Q-statistic

a. is based on the estimated autocorrelation function.
b. is used to test whether a series is white noise or not.
c. follows the chi-square distribution.
d. tests whether the residual autocorrelations as a set are significantly different from zero.
e. All of the above.
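
The Q-statistic referred to in this group of questions can be computed directly from the residual autocorrelations. The sketch below assumes the Ljung-Box form, Q = n(n + 2) times the sum over lags k = 1..m of r_k^2 / (n - k); the function name ljung_box_q and the toy residual series are illustrative assumptions, not material from the text.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q: n(n+2) * sum_{k=1..m} r_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [x - mean for x in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # lag-k sample autocorrelation of the residuals
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

# Toy residuals that alternate in sign: strongly (negatively) autocorrelated.
toy_residuals = [1.0, -1.0] * 10
q = ljung_box_q(toy_residuals, 1)
print(round(q, 2))  # 20.9
```

The computed Q is compared with a chi-square critical value; for residuals from a fitted ARMA(p, q) model the degrees of freedom are usually taken as the number of lags minus p minus q. A Q above the critical value means the residual autocorrelations, taken as a set, differ from zero, so the residuals do not look like white noise.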

 

 

____  53.   Using the Ljung-Box statistic applied to a sample of 30 forecast errors, we cannot reject the null of a white noise process if the sample Q-value is less than ____ at the 10% level of significance.

a. 10.
b. 20.
c. 30.
d. 40.
e. None of the above.

 

 

____  54.   The Q-statistic follows which probability distribution?

a. Normal.
b. Standard Normal.
c. t distribution.
d. F distribution.
e. None of the above.

 

 

____  55.   The diagnostic step in the Box-Jenkins model selection process essentially examines the forecast errors for

a. trend.
b. serial correlation.
c. independence.
d. white noise.
e. All of the above.

 

 

____  56.   What is the null hypothesis being tested using the Ljung-Box statistic?

a. The set of autocorrelations is jointly equal to zero.
b. The set of autocorrelations are jointly not equal to zero.
c. The set of autocorrelations are jointly equal to one.
d. What is the null hypothesis being tested using the Ljung-Box statistic?
e. All of the above.

 

 

____  57.   What problem arises when applying ARIMA-type models to highly seasonal monthly data?

a. Autocorrelation.
b. Heteroscedasticity.
c. Extremely high-order AR and MA processes.
d. Stationarity.
e. All of the above.

 

 

____  58.   Besides using sophisticated ARIMA-type models capable of internally handling data seasonality, an alternative is to use

a. seasonal dummy variables.
b. trend dummy variables.
c. deseasonalized data, and then reseasonalize to generate forecasts.
d. Holt’s smoothing.
e. All of the above.

 

 

____  59.   What ARIMA model is suggested by the above correlogram?

 

a. ARIMA (2,0,2)
b. ARIMA (4,0,2)
c. ARIMA (2,0,4)
d. ARIMA (0,0,2)

 

 

____  60.   What ARIMA model is suggested by the correlogram above?

 

 

a. ARIMA (0,1,0)
b. ARIMA (0,1,1)
c. ARIMA (1,1,0)
d. ARIMA (1,0,1)

 

 

____  61.   The correlogram above suggests what type of ARIMA model?

 

a. ARIMA (1,0,1)
b. ARIMA (2,0,1)
c. ARIMA (1,0,2)
d. ARIMA (0,2,1)

 

 

 

 

Electricity Usage Data (144 monthly observations):

 

 

Correlogram of the original electricity usage data:

 

ARIMA Model:

 

____  62.   In the electricity usage data above

 

a. the chosen ARIMA model took into account seasonality.
b. the chosen ARIMA model does not include any adjustment for seasonality.
c. the chosen ARIMA model appears to suffer from autocorrelation.
d. the chosen ARIMA model appears to suffer from multicollinearity.

 

 

____  63.   In the electricity usage model above

 

a. the residuals do not appear to be white noise.
b. the “Q” statistic is not statistically significant.
c. there is one autoregressive term.
d. there are no seasonal terms.

 

 

____  64.   Consider the ARIMA model specified above.

 

a. This is an MA(1) model.
b. This is an AR(1) model.
c. There is one degree of normal differencing used.
d. The model adjusts for seasonality.

 

 

____  65.   The ARIMA model above represents an analysis of data with 200 observations. Is the “Q” statistic acceptable?

 

a. Yes, because the critical value is about 26.
b. Yes, because the critical value is about 32.
c. No, because the critical value is about 17.
d. No, because the critical value is about 6.

 

 

____  66.   The ARIMA model above was estimated from 200 observations of data. Twelve lags were used to calculate the “Q” statistic.

 

The ARIMA model above would require how many degrees of freedom in the test statistic to determine if the model is appropriate?

 

a. 3
b. 6
c. 10
d. 14

 

 

____  67.   The ARIMA model above was estimated using 78 quarterly observations. Using the appropriate test, examine whether this is an appropriate model.

 

a. The model is appropriate because the critical value is about 17.
b. The model is appropriate because the critical value is about 33.
c. The model is inappropriate because the critical value is about 8.
d. The model is inappropriate because the critical value is about 12.

 

 

____  68.   The correlogram above was calculated from the residuals to an ARIMA model that analyzed quarterly data.

 

a. The model appears to have produced white noise.
b. The model does not seem to be an appropriate model.
c. The model appears to be seasonal.
d. The model could be an AR(16).

 

 

Chapter 9

 

Multiple Choice

Identify the choice that best completes the statement or answers the question.

 

Decile-Wise

 

A data mining routine has been applied to a transaction dataset and has classified 88 records as fraudulent (30 correctly so) and 952 as nonfraudulent (920 correctly so).

 

The decile-wise lift chart for a transaction data model:

 

 

____    1.   Consider the decile-wise lift chart above. Interpret the meaning of the first and second bars from the left.

a. The first variable in the model is more predictive than the second variable.
b. These bars are never interpreted for the validation dataset; they are only interpreted for the training dataset.
c. Since only two bars rise above unity, little explanatory power is exhibited by the model.
d. The first two bars show that this model outperforms a random assignment.

 

 

____    2.   Consider the decile-wise lift chart above. An analyst comments that you could improve the accuracy of the model by classifying everything as nonfraudulent. What will the error rate be if you follow her advice?

a. The error rate will increase.
b. The error rate will decrease.
c. The change in the error rate cannot be determined.
d. The error rate will arbitrarily change.
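
The counts in the Decile-Wise scenario are enough to rebuild the four cells of a confusion matrix and compare error rates. The sketch below works only from the figures stated in the scenario (88 classified fraudulent, 30 correctly; 952 classified nonfraudulent, 920 correctly); the variable names are assumptions, and "classify everything as nonfraudulent" is read here as assigning every record to the majority class.

```python
# Counts stated in the scenario
classified_fraud, classified_fraud_correct = 88, 30
classified_nonfraud, classified_nonfraud_correct = 952, 920

# Derived confusion-matrix cells
true_fraud_caught = classified_fraud_correct                          # 30
false_alarms = classified_fraud - classified_fraud_correct            # 58
true_nonfraud = classified_nonfraud_correct                           # 920
missed_fraud = classified_nonfraud - classified_nonfraud_correct      # 32

total = classified_fraud + classified_nonfraud                        # 1040
model_error = (false_alarms + missed_fraud) / total

# If every record is called nonfraudulent, only the actual frauds are misclassified.
actual_fraud = true_fraud_caught + missed_fraud                       # 62
all_nonfraud_error = actual_fraud / total

print(f"model error rate:        {model_error:.2%}")         # 8.65%
print(f"all-nonfraud error rate: {all_nonfraud_error:.2%}")  # 5.96%
```

A lower overall error rate does not by itself make the all-nonfraudulent rule more useful, since that rule never identifies a fraudulent record.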

 

 

____    3.   Which of the following situations represents the confusion matrix for the transactions data mentioned above?

 

a. A
b. B
c. C
d. D

 

 

____    4.   What is the classification error rate for the following confusion matrix?

 

 

 

 

a. 2.2%
b. 0.82%
c. 10%
d. 0.21%
e. Impossible to determine from information given.

 

 

____    5.   Consider the Toyota Corolla data below:

 

 

Which variable is a dummy variable?

 

a. Fuel_Type
b. Color_Black
c. KM
d. HP

 

 

____    6.   Which of the variables below (from the Toyota Corolla dataset) is a categorical variable?

 

a. Fuel_Type
b. Color_Black
c. KM
d. HP

 

 

Flight Delays Data (Naive Bayes Model)

 

N.B.

Success = 1 = Delayed

Failure = 0 = Ontime

 

 

 

 

____    7.   Using the Flight Delays data above that was computed using a Naive Bayes Model, calculate the ontime probability for the following flight:

 

Carrier = DL

Day of Week = 7

Departure Time = 1000 – 1059

Destination = LGA

Origin = DCA

Weather = 0

 

a. 87%
b. 92%
c. 95%
d. 97%
e. 99%

 

 

____    8.   Consider the following confusion matrix.

 

 

 

How much better did this data mining technique do as compared to a naive model?

 

a. no better than a naive model.
b. 1.2% better than a naive model.
c. 5.6% better than a naive model.
d. 7.8% better than a naive model.
e. 10.1% better than a naive model.

 

 

____    9.   “Bayesian Probability” as used in the Naive Bayes Model

a. uses naive probabilities to estimate class probabilities.
b. uses only a single classifying variable to estimate the class probabilities.
c. uses simple probabilities instead of conditional probabilities.
d. uses derived probabilities to obtain class probabilities.

 

 

____  10.   “Overfitting” refers to

a. estimating a model that explains the data points perfectly and leaves no error but that is unlikely to be accurate in prediction.
b. using too many independent variables or classifiers in a model.
c. the process used to test data mining models for accuracy.
d. the estimation or scoring of new data.

 

 

____  11.   How does a “k-nearest neighbor” model work?

a. It uses conditional probabilities to estimate the prior probability of interest.
b. It uses geometric distances from observations in the data to select a class for an unknown.
c. It uses a dichotomous dependent variable estimated with any type of independent variable.
d. It is based upon the concept of algorithmic minimization.
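
The distance-based voting that a k-nearest neighbor model performs can be sketched in a few lines. Everything in this example is hypothetical: the six training points, the class labels "A" and "B", and the function name knn_classify are made up for illustration and are not taken from any dataset in this test bank. Euclidean distance is assumed; in practice the variables are often standardized first.

```python
import math
from collections import Counter

def knn_classify(training_rows, new_point, k):
    """Classify new_point by majority vote among its k nearest training rows."""
    ranked = sorted(training_rows, key=lambda row: math.dist(row[0], new_point))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data: (features, class label)
training_rows = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"),
    ((7.0, 5.5), "B"),
    ((6.5, 7.0), "B"),
]

print(knn_classify(training_rows, (2.0, 2.0), k=1))  # A
print(knn_classify(training_rows, (2.0, 2.0), k=3))  # A
print(knn_classify(training_rows, (5.0, 5.0), k=3))  # B
```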

 

 

____  12.   A “training data set” is

a. used to compare models and pick the best one.
b. used to build various models of interest.
c. used to assess the performance of the chosen model with new data.

 

 

____  13.   A “validation data set” is

a. used to compare models and pick the best one.
b. used to build various models of interest.
c. used to assess the performance of the chosen model with new data.

 

 

Logistic Regression

 

The following is a Logistic Regression coefficient table for the UniversalBank data. The dichotomous “Y” variable is Loan Offer (success = 1). The multiple R2 for this Logistic Regression is reported as 0.6544.

 

 

____  14.   For the Logistic Regression Model above, the positive coefficients for dummy variables CD Account, EducGrad, and EducProf

a. are associated with higher probabilities of accepting the loan offer.
b. are insignificant because of their p-values and therefore irrelevant.
c. have Odds that are too high to be considered relevant.
d. are proved to be causally related to the loan offer variable.

 

 

____  15.   Consider the Logistic Regression Model above for the UniversalBank data. The coefficient on the continuous variable Income means that

a. Income is causally related to the loan offer variable.
b. Income is irrelevant because of its p-value.
c. higher values of Income are associated with greater probability of accepting the loan offer.
d. Income is likely not associated with the loan offer variable.
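
How the sign of a logistic regression coefficient translates into odds and probabilities can be shown with a short calculation. The coefficient and intercept below are hypothetical values chosen for illustration; they are not the UniversalBank estimates, which appear only in the coefficient table. The function names are likewise assumptions.

```python
import math

def odds_multiplier(coefficient):
    """Factor by which the odds change for a one-unit increase in the variable."""
    return math.exp(coefficient)

def probability_from_logit(logit):
    """Convert a logit (log-odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-logit))

income_coef = 0.05   # hypothetical positive coefficient
intercept = -6.0     # hypothetical constant term

print(round(odds_multiplier(income_coef), 4))  # greater than 1 for a positive coefficient

# With a positive coefficient, the predicted probability rises with the variable.
for income in (50, 100, 150):
    p = probability_from_logit(intercept + income_coef * income)
    print(income, round(p, 3))
```

A positive coefficient describes an association with higher probability of success; it does not by itself establish causation.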

 

 

____  16.   For the Logistic Regression above using the UniversalBank data, the R2 reported by XLMiner™ was 0.6544.  The lift chart was given as:

 

a. Neither the lift chart nor the R² indicates a high degree of confidence in the model.
b. Both the lift chart and the R² indicate a high degree of confidence in the model.
c. The lift chart indicates high confidence in the model but the R² is at odds with this conclusion.
d. Because only a single bar of the decile-wise lift chart is above 1, there is little confidence in the model.

 

 

____  17.   Consider the Logistic Regression Model above for the UniversalBank data. Which variable or variables appear to be insignificant?

a. Only Age.
b. Age and Experience.
c. Income, CD Account, EducGrad, and EducProf.
d. All variables with “odds” less than zero.

 

 

____  18.   Consider the Logistic Regression Model above for the UniversalBank data.

a. Strong collinearity can lead to problems with the model.
b. Strong correlation among the independent variables is not a difficulty when using Logit.
c. The Logit Model automatically adjusts for collinearity.
d. None of the above are correct.

 

 

RidingLawnmower Problem

 

 

____  19.   Consider the RidingLawnmower data above and the K-Nearest Neighbor Model results shown.

a. The optimal value of k was 8 because there was an almost even split between owners and non-owners.
b. The optimal value of k should always be less than the number of independent variables.
c. The optimal value of k is the number of “neighbors” the model has chosen to poll when selecting a category choice.
d. The optimal value of k is irrelevant since we most often let k=1.

 

 

____  20.   Examine the RidingLawnmower data above. Consider a new household with $60,000 income and lot size 20,000 sq. ft. Using k=1, would you classify this household as an owner or non-owner?

a. Owner
b. Non-owner
c. Impossible to tell

 

 

____  21.   Consider the RidingLawnmower data above. Consider a new household with $60,000 income and lot size 20,000 sq. ft. Using k=3, would you classify this household as an owner or non-owner?

 

a. Owner
b. Non-owner
c. Impossible to tell

 

 

____  22.   Consider the RidingLawnmower data above. Why would the model choose a higher value of k than k=1?

 

a. The model will rarely choose higher values of k unless there is collinearity in the independent variables.
b. The model only chooses higher values of k when the dataset is large.
c. The choice of k is made by the researcher alone and not the software.
d. Higher values of k provide smoothing that reduces the risk of overfitting due to noise in the training data.
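
Illustrative sketch (not part of the original question): the effect of k on a k-nearest-neighbor vote can be tried out with a few lines of Python. The records below are invented for illustration and are not the RidingLawnmower data; the function name `knn_classify` is hypothetical, and the variables are left unnormalized for brevity, which a real analysis would normally avoid.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Return the majority label among the k training records nearest to query.

    train is a list of ((x1, x2), label) pairs; distance is plain Euclidean.
    """
    nearest = sorted(train, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical records: (income in $000s, lot size in 000s sq. ft.), label.
# The first record is a deliberately "noisy" non-owner sitting among owners.
train = [
    ((62.0, 20.5), "non-owner"),
    ((64.0, 21.0), "owner"),
    ((58.0, 19.0), "owner"),
    ((85.0, 22.0), "owner"),
    ((40.0, 15.0), "non-owner"),
]
query = (61.0, 20.0)

print(knn_classify(train, query, k=1))  # follows the single nearest (noisy) record
print(knn_classify(train, query, k=3))  # majority vote of the three nearest records
```

With these made-up points, k=1 copies the label of the one closest record, while k=3 lets the two neighboring owners outvote it.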

 

 

____  23.

 

The diagram above represents which data mining technique?

a. K-nearest-neighbor
b. Regression tree
c. Naive Bayes
d. Logit

 

 

____  24.

The above diagram represents what data mining classification scheme?

a. K-nearest-neighbor
b. Regression tree
c. Naive Bayes
d. Logit

 

 

____  25.

The information above was provided for an email that was classified as spam. What data mining technique was probably used to make the classification?

a. K-nearest-neighbor
b. Regression tree
c. Naive Bayes
d. Logit

 

 

____  26.

 

Which data mining technique (represented above) uses a quadratic classifier?

a. K-nearest-neighbor
b. Regression tree
c. Naive Bayes
d. Logit

 

 

____  27.

What data mining technique is represented in the diagram of a classification scheme above?

a. K-nearest-neighbor
b. Regression tree
c. Naive Bayes
d. Logit

 

 

____  28.

The misclassification rate in the confusion matrix above is

a. 0 percent.
b. 10 percent.
c. 9 percent.
d. 19 percent.
e. None of the above are correct.
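
Illustrative sketch (not part of the original question): a misclassification rate is the share of records that fall off the main diagonal of a confusion matrix. The counts below are hypothetical and are not the matrix referred to in the question; `misclassification_rate` is a name chosen for this example.

```python
def misclassification_rate(matrix):
    """matrix[i][j] = number of records of actual class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total

# Hypothetical 2x2 confusion matrix (rows: actual 1, actual 0).
hypothetical = [
    [45, 5],   # actual 1: 45 predicted 1, 5 predicted 0
    [3, 47],   # actual 0: 3 predicted 1, 47 predicted 0
]
rate = misclassification_rate(hypothetical)
print(f"{rate:.0%}")  # 8 errors out of 100 records
```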

 

 

____  29.

The Universal Bank data represented above has been partitioned with what percentages?

a. 50%, 30%, 20% in training, validation, and test sets
b. 60%, 40% in training and validation sets
c. 60%, 20%, 20% in training, validation, and test sets
d. 50%, 20%, 30% in training, validation, and test sets
e. None of the above are correct.

 

 

____  30.   In data mining the model should be applied to a data set that was not used in the estimation process in order to find out the accuracy on unseen data; that “unseen” data set is called

 

a. the training data set.
b. the validation data set.
c. the test data set.
d. the holdout data set.
e. None of the above are correct.

 

 

____  31.   In data mining the term “binning” refers to

a. a Naive Bayes classification system.
b. ranking the data.
c. transforming data into a categorical variable.
d. grouping data into classes.
e. None of the above is correct.

 

 

____  32.   In the K-Nearest-Neighbor technique in data mining, the “K” refers to

a. the originator of the technique, Jonathan Knowlton.
b. the number of classifiers used.
c. the number of classes into which the variable may be divided.
d. the weight of the dependent variable.
e. None of the above is correct.

 

 

____  33.

The data mining technique represented above is probably

a. a k-nearest-neighbor model.
b. a naive Bayes model.
c. a regression tree.
d. a logistic regression.

 

 

____  34.

In setting up this k-nearest-neighbor model

a. the user is allowing XLMiner™ to select the optimal value of k.
b. the optimal k is set by the user at 10.
c. the data is normalized in order to take into account the categorical variables.
d. it is necessary to set an optimal value for k.

 

 

____  35.

In the k-nearest-neighbor model represented above what is the error rate represented?

a. about 3 percent.
b. about 5 percent.
c. about 7 percent.
d. more than 10 percent.

 

 

____  36.

 

The lift chart above shows that the data mining classification model

a. is working well in classifying unseen data.
b. is working well in classifying training data.
c. is working quite poorly.
d. is doing no better at classifying than a naive model.

 

 

____  37.   The diagram below depicts the probability that a person takes out a loan given their level of income. The function shown is

 

 

a. an ordinary least squares model (OLS).
b. a linear probability model (LPM).
c. the odds function.
d. a logit.

 

 

____  38.   Consider the equation below.

This equation is the basis of

 

 

a. the logit model.
b. the naive Bayes Model.
c. the k-nearest neighbor model.
d. classification tree models.

 

 

____  39.   “Pruning” is used in what data mining model?

a. Naive Bayes
b. Logit
c. K-Nearest Neighbor
d. Regression Trees

 

 

____  40.   “Pruning” is used

a. to overcome correlation among the independent variables.
b. only when the independent variables are dichotomous.
c. to prevent the model from overfitting the data.
d. as a “data utility” in order to create a validation set.

 

 

____  41.   With most data mining techniques we “partition” the data

a. into “success” and “failure” results in order to create a dependent variable that is a dummy variable.
b. only when we require a confusion matrix to be created.
c. after estimating the appropriate technique.
d. in order to judge how our model will do when we apply it to new data.
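
Illustrative sketch (not part of the original question): partitioning can be pictured as shuffling the records once and slicing them into separate sets. The 60/20/20 split, the seed, and the function name `partition` are assumptions made for this example only.

```python
import random

def partition(records, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle a copy of records and split it into training, validation, and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                   # used to build the models
    valid = shuffled[n_train:n_train + n_valid]  # used to compare models
    test = shuffled[n_train + n_valid:]          # held back for a final check
    return train, valid, test

train, valid, test = partition(range(100))
print(len(train), len(valid), len(test))
```

Because the model never sees the validation and test records while it is being estimated, its performance on them gives a fairer picture of how it may do on new data.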

 

 

____  42.   “Entropy” measures are used in which data mining technique?

a. Logit
b. Classification Trees
c. Naive Bayes
d. K-Nearest Neighbor
e. Neural Networks

 

 

____  43.   “Information Gain” and “Entropy”

a. are used in Classification Trees to determine when to stop the algorithm.
b. are two components of Bayes Theorem.
c. are related ways of categorizing risk.
d. are unrelated.

 

 

____  44.

 

If I choose to classify Insects as either Katydids or Grasshoppers by examining the distribution of the lengths of the antennas of a sample of the two insects (as shown below), this would be the beginning analysis of what data mining tool?

 

 

a. Naive Bayes
b. Logit
c. Regression Tree
d. K-Nearest Neighbor

 

 

____  45.   What data mining technique is being depicted below?

 

a. K-Nearest Neighbor
b. Naive Bayes
c. Decision Tree
d. Logit
e. Neural Net

 

 

____  46.   Consider the following Lift Chart. Cumulative percentage of hits is the Y-axis variable. Percent of the entire list is the X-axis variable.

 

What is the “Lift” at 5%?

 

 

a. exactly 4
b. about 5
c. exactly 20
d. about 25
e. unable to determine from information given.
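
Illustrative sketch (not part of the original question): one common way to read lift is as the hit rate among the top-ranked records divided by the hit rate in the whole list, which equals the cumulative percentage of hits divided by the percentage of the list. The ranked outcomes below are invented and do not come from the chart in the question; `lift_at` is a hypothetical helper name.

```python
def lift_at(actuals_ranked, fraction):
    """Lift for the top `fraction` of records.

    actuals_ranked holds 0/1 outcomes sorted by model score, highest score first.
    """
    n = len(actuals_ranked)
    n_top = round(n * fraction)
    top_hit_rate = sum(actuals_ranked[:n_top]) / n_top
    overall_hit_rate = sum(actuals_ranked) / n
    return top_hit_rate / overall_hit_rate

# Hypothetical list of 20 records, 5 of which are hits (1s).
ranked = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(lift_at(ranked, 0.20))  # top 4 records hold 3 of the 5 hits
```

A lift of 1 over the full list is expected by construction, since the top 100% of records is the whole list.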

 

 

____  47.   Consider the printout below:

What is the “Misclassification Rate?”

 

a. 0
b. 3
c. 50
d. 30
e. It is not shown in this printout.

 

 

____  48.

Examine the Naive Bayes output above that describes the Titanic survival model.

What is the probability of survival if you are a crew member, male, and adult?

 

a. 0.613324957
b. 0.001352846
c. 0.442673445
d. 0.046373782

 

 

____  49.   The “logit” is

a. a linear function with a Z distribution.
b. an attribute that can be used in a logistic regression.
c. the natural log of an odds ratio.
d. the conditional probability that the success rate is greater than the cutoff value.

 

 

____  50.

The diagram above represents

 

a. the locus of all points that could cause the success rate to be above 50 percent.
b. a logistic regression output from XLMiner.
c. the Naive Bayes classifier as being between zero and one.
d. a graph of the possible values of the logit in a logistic regression.

 

 

____  51.   In logistic regression data mining, P/(1-P) represents

 

a. the logit.
b. the log likelihood of success.
c. the odds of success.
d. the cutoff value.
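
Illustrative sketch (not part of the original question): the quantities named in the logistic regression questions are related by simple formulas, shown here for a probability P of success. The function names are chosen for this example.

```python
import math

def odds(p):
    """Odds of success: P / (1 - P)."""
    return p / (1 - p)

def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))

def inverse_logit(z):
    """Map a logit value back to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

p = 0.75
print(odds(p))                  # 3 to 1 in favor of success
print(logit(p))                 # ln(3)
print(inverse_logit(logit(p)))  # recovers the probability, up to rounding
```

A probability of 0.5 corresponds to odds of 1 and a logit of 0; the logit itself can take any real value.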

 

 

____  52.

The regression line shown above was estimated using an ordinary least squares regression technique. This regression is inappropriate to use on this data because

 

a. the attribute measured here is dichotomous.
b. there is no apparent relationship between hours of study and outcome.
c. there is only a single attribute in the model.
d. the target variable is categorical.

 

 

____  53.   Among the advantages to using the Naive Bayes model is

 

a. it is quite sensitive to irrelevant features.
b. it is fast at classification.
c. it can be used in situations in which the target variable is continuous.
d. All of the above are advantages.

 

 

____  54.   Naive Bayes is called “Naive” because

 

a. very few attributes are needed to obtain accurate classifications.
b. the model assumes that only continuous variables can be used as attributes.
c. it tends to be used only as a “baseline” model in order to measure the effectiveness of other data mining techniques.
d. the attributes are assumed to be independent of one another.

 

 

____  55.   In a Naive Bayes model it is necessary

 

a. that all attributes be categorical.
b. to partition the data into three parts (training, validation, and scoring).
c. to set cutoff values to less than 0.75.
d. to have a continuous target variable.

 

 

____  56.   Naive Bayes models

 

a. use a linear classifier.
b. use a nonlinear classifier.
c. use a waveform classifier.
d. use a logit as a classifier.

 

 

____  57.   Which classification technique that we covered assumes that the attributes have independent distributions?

a. k-Nearest Neighbor
b. Classification trees
c. Naive Bayes
d. Logistic Regression

 

 

____  58.   Our confidence that X is an apple given that we have seen X is red and round

 

a. is a coincident probability.
b. could lead us to misclassify similar objects.
c. is a prior probability.
d. is a posterior probability.
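
Illustrative sketch (not part of the original question): under the Naive Bayes independence assumption, the class probability after seeing "red" and "round" is the prior times each conditional probability, rescaled so the classes sum to one. Every probability below is invented for illustration, and `posterior` is a hypothetical helper name.

```python
def posterior(priors, likelihoods, evidence):
    """Naive Bayes update: prior x product of P(feature | class), then normalize."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for feature in evidence:
            score *= likelihoods[cls][feature]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

# Hypothetical probabilities before any attribute is observed.
priors = {"apple": 0.3, "other": 0.7}
# Hypothetical P(attribute | class), treated as independent within each class.
likelihoods = {
    "apple": {"red": 0.8, "round": 0.9},
    "other": {"red": 0.2, "round": 0.3},
}

post = posterior(priors, likelihoods, ["red", "round"])
print(round(post["apple"], 3))  # belief in "apple" after seeing red and round
```

With these made-up numbers, the belief in "apple" rises from 0.3 before the evidence to roughly 0.84 after it.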

 

 

____  59.

What data mining technique is demonstrated here?

 

a. k-Nearest Neighbor
b. Classification Tree
c. Naive Bayes
d. Logistic Regression

 

 

____  60.

The table above is part of the output from a data mining algorithm seeking to predict whether an individual will take out a personal loan given a set of attributes. What data mining technique is probably being used here?

 

a. k-Nearest Neighbor
b. Classification Tree
c. Naive Bayes
d. Regression Tree

 

 

____  61.

 

The above is a prune log for a data mining technique. What technique would have this type of output?

 

a. k-Nearest Neighbor
b. Classification Tree
c. Naive Bayes
d. Logistic Regression

 

 

____  62.

 

The above chart is a decile-wise lift chart. The first bar on the left indicates

 

a. that our attribute or attributes did little to explain predicted success in this model.
b. that the lift will not vary with the number of cases we consider.
c. that taking the 10% of the records that are ranked by the model as the most probable “1’s” yields about as much as a naive model.
d. that taking the 10% of the records that are ranked by the model as the most probable “1’s” yields twice as many 1’s as would a random selection of 10% of the records.

 

 

____  63.

Which attribute above provides the greatest reduction in entropy?

 

a. Hair Length
b. Weight
c. Age
d. The above are not reasonable entropy measures; rather they show information gain.

 

 

____  64.   When a collection of objects is completely uniform,

 

a. entropy is at a maximum.
b. entropy is at a minimum.
c. entropy would be about .5.
d. Uniformity has nothing to do with entropy.
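
Illustrative sketch (not part of the original question): Shannon entropy of a set of class labels can be computed directly from the class proportions. The sketch assumes "completely uniform" means every object belongs to the same class; the labels are invented, and `entropy` is a name chosen for this example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the class labels in a collection."""
    n = len(labels)
    proportions = [count / n for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in proportions)

same_class = ["owner"] * 8                        # every object alike
even_split = ["owner"] * 4 + ["non-owner"] * 4    # two classes, half each

print(entropy(same_class))   # no uncertainty about the class
print(entropy(even_split))   # the most uncertainty two classes can have
```

Information gain for a candidate split is then the entropy before the split minus the weighted average entropy of the groups after it.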

 

 

____  65.   Suppose that a data mining routine has an adjustable cutoff (threshold) mechanism by which you can alter the proportion of records classified as owner. Three cases are described below.

 

Describe how moving the cutoff up or down from a starting point of 0.5 affects the misclassification error rate.

 

 

 

 

a. The misclassification error rate dropped as the threshold dropped.
b. The misclassification error rate dropped as the threshold increased.
c. The misclassification error rate remained unchanged as the threshold changed.
d. The misclassification error rate changed when the threshold either increased or decreased.
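
Illustrative sketch (not part of the original question): the effect of a cutoff can be explored by recomputing the error rate at several thresholds. The scores and outcomes below are invented and are not the three cases referred to in the question; a record is classified as an owner when its score is at or above the cutoff, which is an assumption of this sketch.

```python
def error_rate(scores, actuals, cutoff):
    """Share of records whose predicted class (score >= cutoff) differs from the actual class."""
    errors = sum(
        1 for score, actual in zip(scores, actuals)
        if (1 if score >= cutoff else 0) != actual
    )
    return errors / len(scores)

# Hypothetical predicted probabilities of "owner" and the actual classes (1 = owner).
scores = [0.95, 0.85, 0.70, 0.60, 0.55, 0.45, 0.40, 0.30, 0.20, 0.10]
actuals = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

for cutoff in (0.25, 0.50, 0.75):
    print(cutoff, error_rate(scores, actuals, cutoff))
```

Lowering the cutoff classifies more records as owners and raising it classifies fewer, so the mix of false positives and false negatives shifts with the threshold; how the overall rate responds depends on the data at hand.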