
Searching for "Similar Markets" to Build Portfolios — Reproducing a Search-Based Asset Allocation Using Market Structure Similarity

Audience / Key Points

Audience: Investors interested in diversification and portfolio theory who intuitively believe that improving risk estimation should improve allocation. No math required — we cover the method's structure and verification results.

Key Points:

  • A method was proposed that searches past "similar market environments" to synthesize covariance matrices and improve portfolio allocation
  • We independently reproduced the results and confirmed that Risk Parity (RP) portfolios nearly match the paper's reported values
  • However, once transaction costs are included, the advantage over a simple baseline (Ledoit-Wolf) disappears: RP is equal or worse even before costs, and MVP's edge reverses at 20bp. This quantitatively demonstrates that better estimation does not automatically mean better returns

What Does This Method Do?

In March 2020, the COVID crash reshuffled inter-industry correlations overnight. Investors who built portfolios from recent data rode into the drawdown assuming "normal diversification" still held. Similar regime breaks had happened before — but there was no systematic way to ask "which past period looked like today?"

Hoshino (2026), in a preprint [1], proposes exactly that: searching through history for periods with "similar market structure" and blending their covariance matrices to build better portfolios.

The bottom line first: the method reproduces the paper's results almost exactly. But after transaction costs, it performs no better than the simplest estimation method (Ledoit-Wolf). Below, we trace why.

Why Use "Similar Periods"?

When building a portfolio, you need to estimate how much assets move together — the "covariance matrix." This matrix changes dramatically by market regime. Correlation patterns during the Lehman crisis look nothing like those in a calm bull market.

The naive approach estimates from the most recent 250 days, but with 49 industries that means 1,225 parameters — too many for the data, producing noisy estimates. Averaging over 20 years reduces noise but introduces bias from irrelevant regimes.
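The parameter count follows from the symmetry of the covariance matrix: n(n+1)/2 free entries for n assets. The sketch below checks that figure and then illustrates the noise problem on simulated data. The simulated returns, the seed, and the identity "true" covariance are illustrative assumptions, not the article's data.

```python
import numpy as np

n_assets = 49
n_days = 250

# Free parameters in a symmetric covariance matrix: n(n+1)/2
n_params = n_assets * (n_assets + 1) // 2
print(n_params)  # 1225

# Simulated returns whose TRUE covariance is the identity,
# so every true eigenvalue equals 1 (illustrative assumption).
rng = np.random.default_rng(0)
returns = rng.standard_normal((n_days, n_assets))

sample_cov = np.cov(returns, rowvar=False)
eigvals = np.linalg.eigvalsh(sample_cov)

# With only 250 observations, the estimated eigenvalues spread
# well away from 1 even though nothing "real" separates them.
print(eigvals.min(), eigvals.max())
```

The spread of the estimated eigenvalues around 1 is pure estimation noise, which is the problem shrinkage and the search-based method both try to reduce.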

This method targets the middle ground. It finds the 10 most similar past market structures and synthesizes their covariance matrices via maximum likelihood estimation. By selecting only similar regimes, it should have less bias than the full-period average and less noise than the 250-day window — at least in theory.
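A minimal sketch of the "find the 10 most similar periods and blend" idea, under stated assumptions. Similarity is plain Euclidean distance on a feature vector, searched exhaustively, and an observation-weighted average stands in for the paper's maximum likelihood synthesis, whose exact form is not given here. The function names `top_k_similar` and `blend_covariances` are hypothetical.

```python
import numpy as np

def top_k_similar(features, query, k=10):
    """Indices of the k rows of `features` closest to `query` (exact search)."""
    dists = np.linalg.norm(features - query, axis=1)
    return np.argsort(dists)[:k]

def blend_covariances(covs, n_obs):
    """Average covariance matrices, weighting each by its observation count.

    Assumption: this pooled average is a stand-in for the paper's
    maximum likelihood synthesis.
    """
    weights = np.asarray(n_obs, dtype=float)
    weights = weights / weights.sum()
    return np.tensordot(weights, covs, axes=1)

# Hypothetical history: 120 past months, one feature vector and one
# covariance matrix per month (random placeholders, not real data).
rng = np.random.default_rng(0)
history_features = rng.standard_normal((120, 49))
history_covs = np.stack([np.eye(49) * (1 + 0.01 * i) for i in range(120)])
today = rng.standard_normal(49)

idx = top_k_similar(history_features, today, k=10)
blended = blend_covariances(history_covs[idx], n_obs=np.full(10, 250))
print(idx.shape, blended.shape)  # (10,) (49, 49)
```

In a backtest, `history_features` must contain only periods that precede the rebalance date, otherwise the search leaks future information.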

So how does it define "similar"? This is the core of the method.

How Is "Similarity" Measured?

Four types of features are used to measure market structure similarity, falling into two categories: network-derived and matrix-derived.

Network-based (graph constructed from inter-industry correlations)

  1. Fiedler vector: Extracts graph structure features. Captures whether the market tends to split into two groups
  2. Closeness centrality: Quantifies how "close" each industry is to others on the network. Represents the distribution of market "connectivity"

Matrix-based (extracted directly from the covariance matrix)

  1. Leading eigenvector: The dominant direction of variation in the covariance matrix. Shows where overall market risk is concentrated
  2. Eigenvalue distribution: Whether risk concentrates in a few factors or is dispersed
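Three of these four features can be sketched with NumPy alone. The graph construction is an assumption made for illustration (edge weight = absolute correlation, no self-loops); the paper's actual construction may differ. Closeness centrality is left out because it needs shortest-path computations on the graph. The function name `structure_features` is hypothetical.

```python
import numpy as np

def structure_features(cov):
    """Fiedler vector, leading eigenvector and eigenvalue shares of `cov`."""
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)

    # Assumed graph: edge weight = |correlation|, no self-loops.
    adj = np.abs(corr)
    np.fill_diagonal(adj, 0.0)
    laplacian = np.diag(adj.sum(axis=1)) - adj

    # eigh returns eigenvalues in ascending order.
    _, lap_vecs = np.linalg.eigh(laplacian)
    fiedler = lap_vecs[:, 1]   # eigenvector of the 2nd-smallest eigenvalue

    cov_vals, cov_vecs = np.linalg.eigh(cov)
    leading = cov_vecs[:, -1]  # eigenvector of the largest eigenvalue
    shares = cov_vals[::-1] / cov_vals.sum()  # descending, sums to 1
    return fiedler, leading, shares

# Placeholder covariance from simulated returns (not real data).
rng = np.random.default_rng(1)
cov = np.cov(rng.standard_normal((250, 49)), rowvar=False)
fiedler, leading, shares = structure_features(cov)
print(fiedler.shape, leading.shape, shares[:3])
```

The signs of the Fiedler vector's entries give the two-group split, and `shares` shows how much of total variance the top factors carry.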

Historical time points where these features are closest to "now" are identified, and their covariance matrices are used to build the portfolio. How much improvement did the paper achieve?

Paper Results

The evaluation uses the Fama-French 49 Industry Portfolios [2] (US, from 1926) over January 2006 to December 2025. Benchmarks are the sample covariance and Ledoit-Wolf shrinkage estimation [3]. The metric is the risk-return ratio (annualized return ÷ annualized risk); higher means better risk-adjusted returns.

Minimum Variance Portfolio (MVP)

| Method | Ann. Return | Ann. Risk | Risk-Return Ratio |
| --- | --- | --- | --- |
| Sample covariance | 9.64% | 14.18% | 0.680 |
| Ledoit-Wolf | 9.69% | 14.18% | 0.684 |
| Proposed (Fiedler) | 10.59% | 14.22% | 0.744 |

Risk Parity Portfolio (RP)

| Method | Ann. Return | Ann. Risk | Risk-Return Ratio |
| --- | --- | --- | --- |
| Sample covariance | 10.43% | 18.82% | 0.554 |
| Ledoit-Wolf | 10.43% | 18.83% | 0.554 |
| Proposed (Closeness+AIRM) | 10.58% | 18.85% | 0.561 |

MVP's risk-return ratio rises from 0.680 to 0.744 — a relative improvement of about 9% ((0.744−0.680)÷0.680). RP shows only about 1% improvement. The RP margin is notably thin.

These margins can easily collapse with small implementation or evaluation differences. So we attempted an independent reproduction under matched conditions.

Our Reproduction

Setup

| Item | Paper | Our Test |
| --- | --- | --- |
| Data | FF49 Industry Portfolios | Same (public data) |
| Evaluation period | 2006-01 to 2025-12 | Same |
| Algorithm | DuckDB VSS (HNSW) | Exact brute-force surrogate |
| Annualization | Unspecified (consistent with CAGR) | CAGR |
| Portfolios | MVP / RP | Same |

This is a "faithful reproduction": data and evaluation period match the paper. Only the search engine differs (the paper uses approximate nearest-neighbor search; we use exact search).

Benchmark Reproduction (Phase 0)

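| Method | Measured | Paper | Gap | Verdict |
| --- | --- | --- | --- | --- |
| Sample MVP | 0.650 | 0.680 | -0.031 | Directionally consistent |
| LW MVP | 0.662 | 0.684 | -0.021 | Directionally consistent |
| Sample RP | 0.553 | 0.554 | -0.001 | Near-exact match |
| LW RP | 0.559 | 0.554 | +0.006 | Near-exact match |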

The RP benchmarks match the paper almost exactly. The MVP benchmarks come out slightly lower but are directionally consistent.

Main Results (Phase 1)

| Method | Measured | Paper | Gap | Verdict |
| --- | --- | --- | --- | --- |
| Fiedler MVP | 0.697 | 0.744 (ANN) / 0.707 (AIRM) | -0.010 (vs AIRM) | Near match |
| Closeness+AIRM RP | 0.557 | 0.561 | -0.004 | Near match |

RP differs by just 0.004 from the paper. MVP converges to within 0.010 of the re-ranking variant (AIRM).

Implementation Pitfalls Found During Reproduction

Two issues emerged that significantly affected results.

First: the annualization method. The paper doesn't specify its formula, but using CAGR (geometric mean annual return) instead of arithmetic mean (daily mean × 252) aligns with the paper's tables. With arithmetic mean, the RP risk-return ratio becomes 0.62 — far from the paper's 0.55. Annualized risk matches while return inflates, which initially made it look like a methodological problem.
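The two annualization conventions can be compared side by side. This is a minimal sketch assuming daily simple returns and 252 trading days per year; the function name `annualized_stats` and the toy return series are illustrative. The arithmetic figure is never below the geometric one, and for daily data the gap is roughly half the annualized variance, which is why the choice matters more for the higher-risk RP portfolio.

```python
import numpy as np

def annualized_stats(daily_returns, periods_per_year=252):
    """Return (CAGR, arithmetic annual return, annualized risk)."""
    r = np.asarray(daily_returns, dtype=float)
    years = len(r) / periods_per_year
    # Geometric: compound every daily return, then take the per-year rate.
    cagr = np.prod(1.0 + r) ** (1.0 / years) - 1.0
    # Arithmetic: daily mean scaled by the number of trading days.
    arithmetic = r.mean() * periods_per_year
    risk = r.std(ddof=1) * np.sqrt(periods_per_year)
    return cagr, arithmetic, risk

# Toy series: +2% and -2% alternating for one year (not real data).
toy = np.tile([0.02, -0.02], 126)
cagr, arithmetic, risk = annualized_stats(toy)

# The arithmetic mean is zero, yet the compounded wealth shrinks.
print(cagr, arithmetic, risk)
print(cagr / risk, arithmetic / risk)  # the two risk-return ratios differ
```

Risk is identical under both conventions; only the numerator of the ratio changes, matching the pattern described above.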

Second: Fiedler vector sign ambiguity. The Fiedler vector is an eigenvector, so mathematically v and -v are equally valid solutions, and a numerical solver may return either one. Comparing adjacent months' Fiedler vectors, the sign flips 42% of the time. Under Euclidean distance, a flipped sign makes the most similar period appear maximally distant, so the search for "similar periods" picks the wrong neighbors. Resolving this sign ambiguity alone raised the risk-return ratio from 0.667 to 0.697.
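The article does not say which remedy was applied, so the sketch below shows two common options as assumptions: a distance that treats v and -v as the same vector, and a canonical sign convention applied before storing the vector. Both function names are hypothetical.

```python
import numpy as np

def sign_invariant_distance(u, v):
    """Euclidean distance that treats v and -v as the same eigenvector."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return min(np.linalg.norm(u - v), np.linalg.norm(u + v))

def canonical_sign(v):
    """Flip v so that its largest-magnitude entry is positive."""
    v = np.asarray(v, dtype=float)
    pivot = np.argmax(np.abs(v))
    return v if v[pivot] >= 0 else -v

u = np.array([0.6, -0.8])
flipped = -u  # same market structure, opposite solver sign

print(np.linalg.norm(u - flipped))          # naive distance: 2.0, the maximum for unit vectors
print(sign_invariant_distance(u, flipped))  # 0.0
print(canonical_sign(u), canonical_sign(flipped))  # both become [-0.6, 0.8]
```

The canonical-sign option works with any off-the-shelf vector index, but it can still flip when two entries have nearly equal magnitude; the sign-invariant distance avoids that at the cost of a custom metric.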

The Real Question — Is It Usable?

The reproduction went well. But in investing, the real question isn't "can we reproduce the paper" — it's "can we use this in practice?"

We compared the proposed method against Ledoit-Wolf (LW) as baseline, evaluating cost-adjusted performance.

MVP: Proposed Method vs LW

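| Metric | Ledoit-Wolf | Proposed (Fiedler) | Difference |
| --- | --- | --- | --- |
| Risk-return ratio | 0.662 | 0.697 | +0.035 |
| Risk-return ratio (10bp costs) | 0.641 | 0.655 | +0.014 |
| Risk-return ratio (20bp costs) | 0.619 | 0.613 | -0.006 |
| Max drawdown | 40.71% | 37.84% | -2.87pp |
| Annual turnover | 1.41 | 2.86 | +1.45 |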

At zero cost, +0.035 improvement. At 10bp (0.1% one-way), +0.014 still survives. Max drawdown improves by 2.87 points.

But at 20bp the advantage reverses (-0.006), and annual turnover roughly doubles (2.86 vs 1.41). Monthly rebalancing with the proposed method produces large weight changes, so trading costs weigh heavily.
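How turnover turns into a cost drag can be sketched as follows. This is a simplified model under stated assumptions: costs are proportional to the sum of absolute weight changes at each rebalance, weights do not drift between rebalances, and the first period is built from cash. The article's exact turnover definition (one-way or two-way) is not stated, and the function name `net_returns` is hypothetical.

```python
import numpy as np

def net_returns(weights, gross_returns, cost_rate):
    """Subtract proportional trading costs from per-period portfolio returns.

    weights:       (T, N) target weights at each rebalance
    gross_returns: (T,)   portfolio return earned in each period
    cost_rate:     cost per unit traded, e.g. 0.001 for 10bp
    """
    weights = np.asarray(weights, dtype=float)
    gross = np.asarray(gross_returns, dtype=float)
    # Previous weights; the first period starts from an all-cash portfolio.
    prev = np.vstack([np.zeros(weights.shape[1]), weights[:-1]])
    # Traded amount per period: sum of absolute weight changes.
    turnover = np.abs(weights - prev).sum(axis=1)
    return gross - cost_rate * turnover, turnover

# Toy example with two assets and three monthly rebalances (not real data).
w = [[0.5, 0.5], [0.7, 0.3], [0.7, 0.3]]
net, turnover = net_returns(w, [0.01, 0.01, 0.01], cost_rate=0.001)
print(turnover)  # traded fraction of the portfolio per period
print(net)       # returns after costs
```

A strategy that shifts weights more at each rebalance pays this charge on a larger traded amount, which is how a pre-cost advantage can shrink or reverse.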

RP: Proposed Method vs LW

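| Metric | Ledoit-Wolf | Proposed (Closeness+AIRM) | Difference |
| --- | --- | --- | --- |
| Risk-return ratio | 0.559 | 0.557 | -0.002 |
| Risk-return ratio (10bp costs) | 0.557 | 0.555 | -0.002 |
| Max drawdown | 54.65% | 54.10% | -0.55pp |
| Annual turnover | 0.17 | 0.21 | +0.04 |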

For RP, the proposed method underperforms LW even before costs. The improvement is not just zero — it's negative.

Verdict

| Portfolio | Decision | Reason |
| --- | --- | --- |
| RP (Risk Parity) | Rejected | Equal or worse than LW. Complexity not justified |
| MVP (Min Variance) | On hold | Advantage up to 10bp, MDD improvement. But reverses at 20bp, double turnover |

RP succeeded as a reproduction but proved unnecessary as an operational tool — LW suffices. MVP shows conditional promise, but full adoption carries turnover risk. We position it as an experimental "optional feature."

Why Doesn't "Better Estimation" Mean "Better Performance"?

This result may seem counterintuitive. If covariance estimation improves, shouldn't portfolios improve too?

Two reasons.

First, LW is already remarkably good. Ledoit-Wolf shrinkage is a single function call, runs instantly, needs no parameter tuning, yet reliably improves on sample covariance. The search-based method must beat this baseline to be worthwhile — and for RP, it couldn't.
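To show how little machinery sits behind that single call, here is a NumPy sketch of Ledoit-Wolf shrinkage toward a scaled identity, following the 2004 estimator. It is an illustrative reimplementation, not the code used in this verification; in practice scikit-learn's `sklearn.covariance.LedoitWolf` packages the same idea. The function name `ledoit_wolf_shrinkage` is hypothetical.

```python
import numpy as np

def ledoit_wolf_shrinkage(returns):
    """Shrink the sample covariance toward a scaled identity matrix.

    Returns (shrunk covariance, shrinkage intensity in [0, 1]).
    """
    x = np.asarray(returns, dtype=float)
    x = x - x.mean(axis=0)
    t, n = x.shape
    sample = x.T @ x / t

    mu = np.trace(sample) / n  # average variance
    target = mu * np.eye(n)

    # d2: squared distance between the sample matrix and the target.
    d2 = np.sum((sample - target) ** 2) / n
    # b2: estimated noise in the sample matrix, capped at d2.
    # Uses sum_k ||x_k x_k' - S||^2 = sum_k ||x_k||^4 - t * ||S||^2.
    b2_bar = (np.sum(np.sum(x ** 2, axis=1) ** 2)
              - t * np.sum(sample ** 2)) / (n * t ** 2)
    b2 = min(b2_bar, d2)

    shrinkage = b2 / d2 if d2 > 0 else 0.0
    return shrinkage * target + (1.0 - shrinkage) * sample, shrinkage

# Simulated returns, 60 days by 10 assets (not real data).
rng = np.random.default_rng(0)
data = rng.standard_normal((60, 10))
shrunk, intensity = ledoit_wolf_shrinkage(data)
print(intensity)
print(np.linalg.cond(np.cov(data, rowvar=False)), np.linalg.cond(shrunk))
```

The intensity is computed from the data itself, which is why no parameter tuning is needed, and the shrunk matrix is better conditioned than the raw sample covariance.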

Second, estimation improvement gets offset by costs and turnover. Better covariance estimation moves portfolio weights in the "correct direction." But moving weights means rebalancing costs. When the estimation improvement is small, the incremental rebalancing cost consumes it entirely.
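A rough break-even cost can be read off the MVP comparison table by linear interpolation between the 10bp and 20bp rows. This assumes the advantage over LW declines linearly in the cost level, which is a simplification for illustration and not a measured result; the function name `breakeven_cost_bp` is hypothetical.

```python
def breakeven_cost_bp(cost_lo, diff_lo, cost_hi, diff_hi):
    """Cost level (in bp) where the advantage crosses zero, by linear interpolation."""
    slope = (diff_hi - diff_lo) / (cost_hi - cost_lo)
    return cost_lo - diff_lo / slope

# MVP risk-return ratio difference vs LW: +0.014 at 10bp, -0.006 at 20bp.
print(round(breakeven_cost_bp(10, 0.014, 20, -0.006), 1))  # 17.0
```

Under that assumption the MVP advantage would vanish at around 17bp, which sits inside the range of costs many investors actually face.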

Comparing with a different quant strategy — the US-Japan sector lead-lag strategy — reveals the pattern:

| | Lead-lag strategy | Search-based allocation |
| --- | --- | --- |
| Edge source | Time-zone information lag | Covariance estimation improvement |
| Edge magnitude | 26% annualized (pre-cost) | Risk-return ratio +0.035 (pre-cost) |
| Execution frequency | Daily | Monthly |
| Cost sensitivity | Extremely high | Moderate |
| Cause of death | Thin margins × daily costs | Thin improvement × turnover |

Both follow the pattern of "the map is correct, but operating costs make it unprofitable." But they die differently. Lead-lag gets killed by daily transaction costs. Search-based allocation can't clear the hurdle of LW as a strong baseline.

Lessons from This Verification

Don't underestimate baseline strength. Ledoit-Wolf is a one-liner but an extremely strong estimator. When evaluating new methods, "better than sample covariance" is trivial — "better than LW" is the real bar.

Reproduction and operational judgment are separate questions. Our reproduction closely matched the paper's values. But "reproducible" and "usable" are different. RP succeeded in reproduction but was rejected operationally. MVP's reproduction was partial, yet it remains a conditional candidate. Reproduction accuracy and adoption decisions don't necessarily correlate.

Precisely identify the source of edge. This method's edge comes from "improved risk estimation," not "market prediction." Risk estimation improvements structurally generate smaller profits than return predictions. Whether that thin profit survives as a cost-adjusted margin over the baseline determines the final adoption decision.

In investing, "theoretically correct improvement" and "practically surviving improvement" are different things. Measuring that gap quantitatively is as important in quantitative investing as the theory itself. But one more thing — the experience of stepping on implementation pitfalls through reproduction sharpens judgment when evaluating the next method. This hands-on experiential knowledge, unavailable from reading papers alone, may be the most reproducible asset in quantitative investing.


  1. Hoshino (2026). Search-based asset allocation using market structure similarity. Preprint. 

  2. Kenneth R. French, Fama-French 49 Industry Portfolios. Dartmouth College. 

  3. Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2), 365–411.