CFA · CFA Level II · Unit 02 · Access: Premium

Quantitative Methods

Prepare for Quantitative Methods with CFA practice questions covering 8 topics. Part of CFA Level II — build your knowledge and track your progress with PopCFA.

Questions: 219
Topics: 8
Access: Premium

What’s in it.

8 topics
  • Topic 01 · Multiple Regression: Advanced Topics · 27 questions
  • Topic 02 · Time-Series Analysis · 27 questions
  • Topic 03 · Machine Learning Methods · 27 questions
  • Topic 04 · Big Data and Investment Applications · 30 questions
  • Topic 05 · Simulation Methods · 24 questions
  • Topic 06 · Panel Data Regression · 24 questions
  • Topic 07 · Logistic Regression and Classification · 27 questions
  • Topic 08 · Backtesting and Model Evaluation · 33 questions

Sample questions

3 of many

A few questions from this unit, with the answer and a full explanation. The complete bank is available when you start practising.

  1. How do word embeddings such as Word2Vec differ from TF-IDF in representing words for natural language processing?

    • Word embeddings represent words as dense vectors that capture semantic similarity; TF-IDF is a statistical weighting scheme reflecting term importance within and across documents but ignoring semantic relationships
      Correct answer
    • Word embeddings and TF-IDF are functionally equivalent for financial text because financial documents use a specialised vocabulary where all terms are equally weighted
    • Word embeddings assign a numerical score from 0 to 1 to each word based on its domain-specific importance; TF-IDF produces multi-dimensional vectors that encode positional information
    • Word embeddings weight terms by their frequency in a document relative to their frequency across all documents; TF-IDF requires neural network training on large corpora to produce vector representations
    Explanation

    TF-IDF (Term Frequency–Inverse Document Frequency) is a statistical measure that weights each word in a document by how often it appears in that document, scaled down by how many documents in the corpus contain it. Terms that are frequent in one document but rare across the corpus receive the highest weights. TF-IDF does not capture semantic relationships between words. Word embeddings (e.g., Word2Vec, GloVe) are dense vector representations, low-dimensional relative to the vocabulary size, learned from large corpora; words that occur in similar contexts receive similar vectors, so semantic relationships can be measured mathematically (for example, with cosine similarity).
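
    The contrast can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the three sample sentences are invented, the weighting uses the simplest textbook form (relative term frequency times log(N / document frequency)), and the `TOY_EMBEDDINGS` vectors are hand-picked placeholders rather than output from a trained Word2Vec model.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = count of the term in the document / tokens in the document
    idf = log(number of documents / documents containing the term)
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    weights = []
    for tokens in tokenised:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented sample corpus (assumption, for illustration only).
docs = [
    "profit rose as revenue rose",
    "earnings fell as revenue fell",
    "the company reported revenue",
]
weights = tf_idf(docs)

# "revenue" appears in every document, so its idf (and weight) is zero.
# "profit" and "earnings" each get a positive weight, but TF-IDF treats
# them as unrelated terms: nothing links one to the other.
print(weights[0]["revenue"], weights[0]["profit"], weights[1]["earnings"])

# Hand-picked 2-d vectors standing in for trained embeddings (assumption).
TOY_EMBEDDINGS = {
    "profit": [0.9, 0.1],
    "earnings": [0.8, 0.2],
    "lawsuit": [-0.1, 0.9],
}
sim_related = cosine(TOY_EMBEDDINGS["profit"], TOY_EMBEDDINGS["earnings"])
sim_unrelated = cosine(TOY_EMBEDDINGS["profit"], TOY_EMBEDDINGS["lawsuit"])
print(sim_related > sim_unrelated)
```

    In practice, embeddings would come from a model trained on a large corpus; the toy vectors only show how similarity between dense vectors can encode a relationship that a per-term weight cannot.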

  2. A researcher has a macroeconomic time series that appears to trend upward visually, but the ADF test with a constant and trend specification fails to reject the unit root null (p = 0.12). A second ADF test on the first differences rejects the null of a unit root in the differences (p = 0.01). Based on these results, should the researcher detrend or first-difference the original series, and why?

    • The ADF test results are contradictory and no transformation should be applied until further tests, such as the KPSS test, confirm the order of integration
    • The researcher should detrend and then first-difference in sequence: detrending removes the visible linear trend while first differencing removes any remaining stochastic non-stationarity identified by the ADF test
    • The researcher should first-difference: the ADF results indicate the series is I(1) (unit root in levels but not in first differences), so a stochastic trend is present and first differencing removes it. Detrending would be incorrect because it removes only deterministic trends
      Correct answer
    • The researcher should apply a log transformation rather than first differencing; log transformation stabilises an exponential trend and renders the series stationary without requiring differencing
    Explanation

    The ADF evidence is consistent with an I(1) process: the test fails to reject the unit root null (H₀) in levels and rejects it in first differences, so the differenced series is stationary. The visual upward trend is explained by the drift in the stochastic trend (a random walk with drift), not by a deterministic trend. For an I(1) series, the correct stationarity transformation is first differencing, which removes the stochastic trend. Detrending an I(1) series does not produce a stationary series: the residuals from the fitted trend still contain a unit root, so an ADF test on them would be expected to fail to reject the null. First differencing is therefore the appropriate transformation here.
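
    The difference between the two transformations can be seen on simulated data. The sketch below is an illustration under stated assumptions, not an ADF test: it simulates a random walk with drift (drift 0.5, unit-variance shocks, and a seed chosen arbitrarily) and uses the lag-1 autocorrelation as a rough persistence diagnostic. A formal check would use an ADF implementation such as `adfuller` in statsmodels.

```python
import random

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a sequence."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def detrend(y):
    """Residuals from an OLS fit of y on a constant and a linear time index."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    slope = (sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = y_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(y)]

# Simulated I(1) series: y_t = y_{t-1} + 0.5 + e_t (parameters are assumptions).
rng = random.Random(42)
y = [0.0]
for _ in range(1999):
    y.append(y[-1] + 0.5 + rng.gauss(0.0, 1.0))

detrended = detrend(y)
differenced = [y[t] - y[t - 1] for t in range(1, len(y))]

rho_detrended = lag1_autocorr(detrended)      # stays close to 1: still persistent
rho_differenced = lag1_autocorr(differenced)  # close to 0: shocks are uncorrelated

print(f"detrended:   {rho_detrended:.3f}")
print(f"differenced: {rho_differenced:.3f}")
```

    The detrended residuals remain highly persistent because removing a fitted line leaves the accumulated shocks in place, while the differenced series reduces to drift plus independent shocks.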

  3. In the context of alternative data, what does 'point-in-time data alignment' mean, and why is it essential for backtesting?

    • Point-in-time alignment normalises alternative data to a common unit of measurement to allow direct comparison across different data vendors and geographies
    • Point-in-time alignment synchronises all alternative data feeds to a single calendar date, eliminating differences in reporting frequency across data vendors
    • Point-in-time alignment ensures data is used in a backtest only as of the date it was actually available to investors, preventing the use of information not yet known at the time of the simulated decision
      Correct answer
    • Point-in-time alignment adjusts historical alternative data for survivorship bias by re-including companies that subsequently failed into the historical dataset
    Explanation

    Point-in-time data is data stored as it existed at each historical moment, with later revisions recorded as separate, timestamped entries rather than overwriting the original values. Using data in a backtest that was not actually available until after the simulated trade date constitutes look-ahead bias, which artificially inflates strategy performance. For example, if a vendor retroactively revises a credit card dataset and the backtest uses the revised data for periods when only the original data was available, the model 'knows' figures that were unknowable at the time. Proper point-in-time alignment stores each data observation with its first-availability timestamp.
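
    A minimal sketch of an as-of lookup shows the idea. The `RELEASES` records, their dates and values, and the `as_of` helper are all hypothetical, invented for illustration; a real vendor feed would have its own schema.

```python
from datetime import date

# Hypothetical vendor records, one row per release of a figure:
# (reference_period, first_available, value)
RELEASES = [
    ("2023-Q1", date(2023, 4, 20), 100.0),  # initial release
    ("2023-Q1", date(2023, 7, 15), 112.0),  # later vendor revision
    ("2023-Q2", date(2023, 7, 20), 105.0),  # initial release
]

def as_of(releases, decision_date):
    """Return {period: value} using only releases available on or before decision_date."""
    known = {}
    # Process releases in order of availability so later revisions win.
    for period, available, value in sorted(releases, key=lambda r: r[1]):
        if available <= decision_date:
            known[period] = value
    return known

# A simulated decision on 1 May 2023 sees only the original Q1 figure.
may_view = as_of(RELEASES, date(2023, 5, 1))

# A decision on 1 August 2023 sees the revised Q1 figure and the Q2 figure.
august_view = as_of(RELEASES, date(2023, 8, 1))

print(may_view)
print(august_view)
```

    A backtest that read the revised 112.0 for a May 2023 decision would be using information not yet published, which is the look-ahead bias described above.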