CFA Level II · Unit 02 · Access: Premium
Quantitative Methods
Prepare for Quantitative Methods with CFA practice questions covering 8 topics. Part of CFA Level II — build your knowledge and track your progress with PopCFA.
What’s in it.
8 topics
- Topic 01: Multiple Regression: Advanced Topics (27 questions)
- Topic 02: Time-Series Analysis (27 questions)
- Topic 03: Machine Learning Methods (27 questions)
- Topic 04: Big Data and Investment Applications (30 questions)
- Topic 05: Simulation Methods (24 questions)
- Topic 06: Panel Data Regression (24 questions)
- Topic 07: Logistic Regression and Classification (27 questions)
- Topic 08: Backtesting and Model Evaluation (33 questions)
Sample questions
3 of many. A few questions from this unit, with the answer and a full explanation. The complete bank is available when you start practising.
How do word embeddings such as Word2Vec differ from TF-IDF in representing words for natural language processing?
- Word embeddings represent words as dense vectors that capture semantic similarity; TF-IDF is a statistical weighting scheme reflecting term importance within and across documents but ignoring semantic relationships (Correct answer)
- Word embeddings and TF-IDF are functionally equivalent for financial text because financial documents use a specialised vocabulary where all terms are equally weighted
- Word embeddings assign a numerical score from 0 to 1 to each word based on its domain-specific importance; TF-IDF produces multi-dimensional vectors that encode positional information
- Word embeddings weight terms by their frequency in a document relative to their frequency across all documents; TF-IDF requires neural network training on large corpora to produce vector representations
Explanation: TF-IDF (Term Frequency–Inverse Document Frequency) is a statistical measure that assigns a numerical weight to each word in a document based on how often it appears in that document relative to how often it appears across the entire corpus. It does not capture semantic relationships between words. Word embeddings (e.g., Word2Vec, GloVe) are dense, low-dimensional vector representations trained on large corpora; contextually similar words have similar vectors, enabling semantic relationships to be captured mathematically.
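The contrast can be sketched in a few lines of plain Python. This is a minimal illustration, not a trained model: the three-sentence corpus is invented, the TF-IDF variant shown (raw term frequency times natural-log inverse document frequency) is one of several common definitions, and the two-dimensional "embedding" vectors are hand-picked hypothetical values rather than output from Word2Vec or GloVe.

```python
import math

# Toy corpus (invented sentences, for illustration only).
docs = [
    "profit rose sharply".split(),
    "earnings rose sharply".split(),
    "the bond defaulted".split(),
]

def tf_idf(term, doc, corpus):
    """Term frequency in `doc` times log inverse document frequency in `corpus`."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    return tf * math.log(len(corpus) / df)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

# In a TF-IDF (bag-of-words) space every term is its own axis,
# so two different words are orthogonal regardless of meaning.
vocab = sorted({w for d in docs for w in d})

def one_hot(term):
    return [1.0 if w == term else 0.0 for w in vocab]

# Hypothetical dense vectors standing in for trained embeddings.
toy_embedding = {
    "profit": [0.90, 0.10],
    "earnings": [0.85, 0.15],
    "bond": [0.10, 0.90],
}

# A rarer term gets a higher weight than a term shared across documents.
w_profit = tf_idf("profit", docs[0], docs)
w_rose = tf_idf("rose", docs[0], docs)

sim_tfidf_axes = cosine(one_hot("profit"), one_hot("earnings"))
sim_embedding = cosine(toy_embedding["profit"], toy_embedding["earnings"])
```

Here "profit" outweighs "rose" because it appears in fewer documents, yet the term axes for "profit" and "earnings" have zero similarity. Only the dense vectors place the two near-synonyms close together.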
A researcher has a macroeconomic time series that visually appears to trend upward, but the ADF test with a constant and trend specification fails to reject the unit root null (p = 0.12). A second ADF test on the first differences rejects the null of a unit root in the differences (p = 0.01). Based on these results, should the researcher detrend or first-difference the original series, and why?
- The ADF test results are contradictory and no transformation should be applied until further tests, such as the KPSS test, confirm the order of integration
- The researcher should detrend and then first-difference in sequence: detrending removes the visible linear trend while first differencing removes any remaining stochastic non-stationarity identified by the ADF test
- The researcher should first-difference: the ADF results indicate the series is I(1) (unit root in levels but not in first differences), so a stochastic trend is present and first differencing removes it. Detrending would be incorrect because it removes only deterministic trends, and the series has a stochastic trend (Correct answer)
- The researcher should apply a log transformation rather than first differencing; log transformation stabilises an exponential trend and renders the series stationary without requiring differencing
Explanation: The ADF evidence is consistent with an I(1) process: the level series has a unit root (fail to reject H₀) and the first-differenced series is stationary (reject H₀). The visual upward trend is explained by the stochastic trend's drift component (random walk with drift), not a deterministic trend. For an I(1) series, the correct stationarity transformation is first differencing, which removes the stochastic trend. Detrending an I(1) series does not produce a stationary series: the residuals from the fitted trend still contain the unit root, so an ADF test on them would still be expected to fail to reject the null. First differencing is unambiguously correct here.
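A small simulation illustrates why detrending does not help with a stochastic trend. This is a sketch under stated assumptions: the series is a simulated random walk with drift (the drift, noise scale, and seed are arbitrary choices), and lag-1 autocorrelation is used as a rough persistence diagnostic in place of a formal ADF test, which would normally come from a statistics library.

```python
import random

random.seed(42)  # arbitrary seed so the run is repeatable

# Simulate y_t = y_{t-1} + drift + e_t, an I(1) series with upward drift.
n, drift = 500, 0.2
y = [0.0]
for _ in range(n - 1):
    y.append(y[-1] + drift + random.gauss(0.0, 1.0))

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation (near 1 suggests strong persistence)."""
    m = sum(x) / len(x)
    num = sum((a - m) * (b - m) for a, b in zip(x[:-1], x[1:]))
    den = sum((a - m) ** 2 for a in x)
    return num / den

def detrend(x):
    """Residuals from an OLS fit of x on a constant and a linear time index."""
    t = list(range(len(x)))
    t_bar, x_bar = sum(t) / len(t), sum(x) / len(x)
    slope = (sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
             / sum((ti - t_bar) ** 2 for ti in t))
    intercept = x_bar - slope * t_bar
    return [xi - (intercept + slope * ti) for ti, xi in zip(t, x)]

# First differences recover drift + e_t, which has no unit root.
diffs = [b - a for a, b in zip(y[:-1], y[1:])]

rho_detrended = lag1_autocorr(detrend(y))  # stays close to 1
rho_diff = lag1_autocorr(diffs)            # close to 0
```

The detrended residuals remain highly persistent because the accumulated shocks are still in the series, while the differenced series behaves like noise around the drift.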
In the context of alternative data, what does 'point-in-time data alignment' mean, and why is it essential for backtesting?
- Point-in-time alignment normalises alternative data to a common unit of measurement to allow direct comparison across different data vendors and geographies
- Point-in-time alignment synchronises all alternative data feeds to a single calendar date, eliminating differences in reporting frequency across data vendors
- Point-in-time alignment ensures data is used in a backtest only as of the date it was actually available to investors, preventing the use of information not yet known at the time of the simulated decision (Correct answer)
- Point-in-time alignment adjusts historical alternative data for survivorship bias by re-including companies that subsequently failed into the historical dataset
Explanation: Point-in-time data is stored as it existed at each historical moment, with later revisions kept as separate timestamped records rather than overwriting the original values. When constructing backtests, using data that was not actually available until after the simulated trade date constitutes look-ahead bias, which artificially inflates strategy performance. For example, if a credit card dataset is retroactively revised by the vendor and the backtest uses the revised data for periods when only the original data was available, the model 'knows' outcomes that were unknowable at the time. Proper point-in-time alignment stores each data observation with its first-availability timestamp.
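The idea can be sketched as an "as of" lookup. Everything here is hypothetical: the record layout, the values, the dates, and the helper name `value_as_of` are invented for illustration and do not describe any particular vendor's schema.

```python
from datetime import date

# Hypothetical vendor records: (reporting period, value, date first available).
records = [
    ("2020-01", 100.0, date(2020, 2, 10)),  # original release
    ("2020-01", 120.0, date(2020, 5, 1)),   # later vendor revision
]

def value_as_of(records, period, decision_date):
    """Latest value for `period` that was already available on `decision_date`.

    Returns None when nothing had been published yet, so the backtest
    cannot see data from the future.
    """
    known = [(avail, val) for p, val, avail in records
             if p == period and avail <= decision_date]
    if not known:
        return None
    # Pick the most recently published version among those already known.
    return max(known, key=lambda r: r[0])[1]

before_release = value_as_of(records, "2020-01", date(2020, 1, 31))
after_release = value_as_of(records, "2020-01", date(2020, 3, 1))
after_revision = value_as_of(records, "2020-01", date(2020, 6, 1))
```

A simulated decision on 1 March 2020 sees only the original figure of 100.0; the revised 120.0 becomes visible only for decisions dated on or after 1 May 2020.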