Methodology

Magistra Side Effects Predictor™ — Version 4.0 (dual-track)

Last updated: April 2026 | Open for peer review

Invitation to researchers

This methodology is deliberately transparent. We invite biostatisticians, epidemiologists, clinical researchers, and ML researchers to critique our approach, propose improvements, or collaborate. All feedback is publicly acknowledged in future versions.


1. Dual-track architecture

Rather than blending clinical and community data into a single number (which hides the evidence hierarchy), we compute two parallel estimates per side effect:

Clinical

Only published clinical trials + regulatory reports. Conservative. All modifiers from peer-reviewed sources.

Real world

All sources including Reddit, forums, news. Captures selection bias but also real experiences missing from trials.

The gap between them is itself informative. For example, clinical trials systematically underreport emotional blunting, hair loss, and fatigue. When the real-world estimate is significantly higher, this indicates a gap in clinical reporting, not an error in the prediction. As more clinical data arrives, the two tracks should converge.

2. Data collection

Daily, 18 scrapers collect data from four source categories. Claude Haiku extracts structured data points with a confidence level (high/medium/low):

  • Clinical: PubMed, ClinicalTrials.gov, Cochrane, WHO, EMA, MHRA, preprints
  • Regulatory: FDA FAERS, EMA, MHRA
  • User reports: Reddit (13 subreddits), Trustpilot, Drugs.com, Quora
  • News & guidelines: Google News RSS, professional guidelines

Each data point retains its sourceType, which is used to deduplicate records (keyed on sourceUrl + sideEffect) and to filter them into the two tracks. Extraction is conservative: only explicitly stated rates are captured, with no inference.
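The deduplication and track-splitting step above can be sketched as follows. This is a minimal illustration, not the project's actual code: the record fields (source_type, source_url, side_effect) and the set of types counted as "clinical" are our assumptions based on the descriptions in this section.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataPoint:
    source_type: str   # "clinical", "regulatory", "user_report", "news" (assumed labels)
    source_url: str
    side_effect: str
    rate: float        # explicitly stated rate only; no inference
    n: int             # reported sample size
    confidence: str    # "high" | "medium" | "low" extraction confidence

# Assumed track membership: clinical track = trials + regulatory reports only.
CLINICAL_TYPES = {"clinical", "regulatory"}

def dedupe(points):
    """Keep the first record per (source_url, side_effect) key."""
    seen, out = set(), []
    for p in points:
        key = (p.source_url, p.side_effect)
        if key not in seen:
            seen.add(key)
            out.append(p)
    return out

def split_tracks(points):
    """Return (clinical_track, real_world_track); the real-world track keeps everything."""
    points = dedupe(points)
    clinical = [p for p in points if p.source_type in CLINICAL_TYPES]
    return clinical, points
```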

3. Statistical approach

For each track:

  1. Filter data points by patient profile (sex, dose, ethnicity, exercise)
  2. Weighted average (weight = sample size × extraction confidence)
  3. Winsorization at 5th/95th percentile if n > 10 (bounds logged explicitly)
  4. Dose scaling if data lacks dose specificity
  5. Log-odds modifier application (capped at total shift of 2.5 log-odds)
  6. Random-effects confidence interval (DerSimonian-Laird τ²)
logit(p) = logit(base_rate) + Σ ln(OR_i),   where |Σ ln(OR_i)| ≤ 2.5

4. Self-evolving model

A daily statistical analysis computes empirical odds ratios for every dimension × effect. Key safeguards:

  • FDR correction (Benjamini-Hochberg) across all hypothesis tests
  • Auto-update only when n ≥ 30, corrected p < 0.01, and OR change < 0.3
  • Larger changes are flagged for human review
  • New parameters (ethnicity, BMI, diet) get 'promoted' when significant for 2+ effects
  • Max 5 auto-updates per day; all changes logged with provenance
  • Every previous config version is retained for rollback
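The first two safeguards above can be sketched as follows: a standard Benjamini-Hochberg step-up procedure, plus a gate that applies the listed thresholds. The function names and the boolean gating signature are illustrative, not the project's actual API.

```python
def benjamini_hochberg(p_values, alpha=0.01):
    """Step-up FDR control: return which hypotheses survive at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * alpha / m ...
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank
    # ... then reject every hypothesis ranked at or below k.
    passed = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            passed[i] = True
    return passed

def may_auto_update(n, p_corrected_ok, old_or, new_or,
                    updates_today, max_daily=5):
    """Gate an automatic odds-ratio update; larger changes go to human review."""
    return (n >= 30 and p_corrected_ok
            and abs(new_or - old_or) < 0.3
            and updates_today < max_daily)
```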

5. Honest limitations

We publish our limitations because hidden weaknesses are more dangerous than visible ones:

  • Data volume: Currently ~217 data points across 15 effects. Model health is 'degraded' until we reach n ≥ 100 per effect.
  • Demographic gaps: Female bias (87% of specified sexes), minimal ethnic diversity in sources.
  • Modifier sources: Initial values are hand-coded from literature. Being replaced with empirical values as data accumulates.
  • No interaction terms: Modifiers are applied independently. Interactions (e.g., sex × age) are not modeled — we cap cumulative log-odds shifts to limit stacking bias.
  • Calibration testing: Not yet formally validated against independent outcome data. Planned once n ≥ 500 per effect.
  • LLM extraction: Imperfect. Gold-standard manual audit is planned on a sample of 50 sources per effect.
  • Journey predictor: Weight trajectory and muscle preservation models use STEP trial constants with expert-coded modifiers. Not empirically validated.
  • Not causal: These are population-average conditional risks, not individual causal predictions.

6. Publication roadmap

  • Phase 1 (current): dual-track framework, open methodology, community feedback
  • Phase 2: n ≥ 100 per effect, formal calibration testing
  • Phase 3: n ≥ 500, external validation on independent dataset
  • Phase 4: pre-register at OSF.io, submit to peer-reviewed journal (target: Nature Medicine)

We are a small team and our methodology inevitably has weaknesses. If you see something wrong or improvable, let us know.

Researcher feedback

If you're a biostatistician, epidemiologist, clinical researcher, or ML scientist: critique our approach. We acknowledge all contributors in future versions and maintain a public changelog.

All feedback is manually reviewed. Contributors are acknowledged in the public methodology changelog (unless anonymity is requested).

Open source on GitHub

The full methodology, source code, and preprint are published under Apache 2.0. Inspect, fork, or submit a pull request.

github.com/saurabhgoyal75/magistra-predictor

Cite our preprint

The full methodology preprint is published on Zenodo with a permanent DOI under CC BY 4.0.

Goyal, S. (2026). A Dual-Track Framework for GLP-1 Side Effect Estimation: Separating Clinical Evidence from Real-World Patient Reports (v4.0). Zenodo. https://doi.org/10.5281/zenodo.19559749

Magistra Side Effects Predictor™ — Statistical indicator, not medical advice.
