Medical Data Analysis
Analyzed patient and lab data to identify key indicators influencing disease onset, then built and tuned models to highlight the most impactful factors.
Context
In this medical project, I collaborated closely with pharmacists to improve a diagnostic test and determine which patient-related factors (e.g., age, physical activity, smoking habits) or clinical indicators (e.g., blood group, hormone levels) most strongly influence the onset of a specific medical condition. By combining clinical insights with advanced modeling, we aimed to build interpretable, data-driven tools for better risk assessment.
Approach
- Data: Compiled a dataset combining patient demographics (age, lifestyle) with medical test results (blood markers, hormone levels). Since the real data is confidential, I generated a synthetic dataset with zero correlation, so anyone can run the code without exposing sensitive information.
- Methods: Built and tuned multiple models (e.g., ensemble models), then applied feature importance techniques (e.g., SHAP, permutation importance) to identify which variables most strongly influence disease prediction.
- Management: Structured the workflow into stages: data processing → model development → feature importance evaluation, ensuring a clear, iterative process of analysis and validation.
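The three stages above can be illustrated with a minimal, standard-library sketch. It is not the project's actual code: the column names, the `make_synthetic` generator, and the `age_rule` stand-in model are all hypothetical, and "zero correlation" is assumed here to mean that every feature and the label are drawn independently. The `permutation_importance` function below is a hand-written version of the general technique (mean drop in accuracy when one column is shuffled), not a library call.

```python
import random

# Hypothetical column names, chosen only for illustration.
FEATURES = ["age", "physical_activity", "smoker", "hormone_level"]


def make_synthetic(n_rows, seed=0):
    """Draw every feature and the label independently of each other."""
    rng = random.Random(seed)
    rows = [
        [
            rng.uniform(18, 90),         # age in years
            rng.uniform(0, 10),          # weekly hours of activity
            float(rng.randint(0, 1)),    # smoker flag
            rng.gauss(5.0, 1.5),         # hormone level, arbitrary units
        ]
        for _ in range(n_rows)
    ]
    labels = [rng.randint(0, 1) for _ in range(n_rows)]
    return rows, labels


def accuracy(predict, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    hits = sum(predict(row) == label for row, label in zip(rows, labels))
    return hits / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=5, seed=0):
    """Mean drop in accuracy per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild each row with only column j replaced by a shuffled value.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def age_rule(row):
    """Stand-in for a trained model: predicts from age alone (assumption)."""
    return int(row[0] > 60)


# Stage 1: data processing -> Stage 2: model -> Stage 3: feature importance.
rows, labels = make_synthetic(200)
scores = permutation_importance(age_rule, rows, labels)
for name, score in zip(FEATURES, scores):
    print(f"{name}: {score:+.3f}")
```

Because the stand-in model reads only `age`, shuffling any other column leaves its predictions unchanged, so those features score exactly zero. On independently drawn labels the `age` score also hovers around zero, which is what a dataset with no signal should produce. With a fitted estimator, scikit-learn's `sklearn.inspection.permutation_importance` offers the same idea as a library function.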
Results
The study identified the most influential predictors of disease risk, showing which patient and laboratory parameters to monitor most closely. These findings improved the performance of the diagnostic test and offered interpretable guidance for clinical decision-making.
