Understanding stroke with Bayesian networks

Robert O’Shea


Background: Stroke is a major cause of morbidity and mortality worldwide, causing 5.78 million deaths per annum according to WHO global health estimates. An international effort is underway to improve stroke outcomes through secondary and tertiary preventative measures. To maximise the efficacy of such interventions, we must fully understand the processes that lead to stroke-related morbidity and mortality. We propose to reframe stroke as a component of a network system, with multiple interacting causes and consequences. In real-world epidemiology, social, behavioural and biological risk factors are known to interact. The network paradigm accommodates such complexity well, and has demonstrated value in genetics, pathology and therapeutics. We propose Bayesian network inference as a hypothesis-free method of characterising the causal processes underlying stroke outcomes.
Methods: We examine data recorded during the International Stroke Trial, a multi-centre interventional trial evaluating the efficacy of anticoagulant and antiplatelet agents for secondary prevention in 19,000 cases of stroke. We extract 38 relevant variables pertaining to patient demographics, stroke presentation, clinical features, diagnosis, management and outcomes. A discrete Bayesian network was inferred by optimisation of a network score. The performance of several network scores and search algorithms was compared using cross-validation. This process identified tabu search with the K2 score as the optimal network search protocol. Bootstrap resampling was used to estimate confidence in the network structure.
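To make the scoring step concrete, the following is a minimal sketch of the K2 score for a single node and a candidate parent set, computed in log form. It is not the code used in the study: the function name `log_k2_score`, the variable names `old` and `dead`, and the toy records are all hypothetical, invented for illustration. A score-based search such as tabu search would sum this quantity over all nodes and compare candidate structures that differ by one arc.

```python
from collections import Counter
from math import lgamma, log

def log_k2_score(rows, child, parents):
    """Log K2 score of one discrete node given a candidate parent set.

    rows: list of dicts mapping variable name -> discrete value.
    Implements log prod_j [ (r-1)! / (N_ij + r - 1)! * prod_k N_ijk! ],
    where r is the number of child states, j indexes parent
    configurations and k indexes child states.
    """
    # r = number of distinct child states observed in the data
    r = len({row[child] for row in rows})
    # counts[parent_config][child_state] = N_ijk
    counts = {}
    for row in rows:
        config = tuple(row[p] for p in parents)
        counts.setdefault(config, Counter())[row[child]] += 1
    score = 0.0
    for child_counts in counts.values():
        n_ij = sum(child_counts.values())
        # log[(r-1)! / (N_ij + r - 1)!], using lgamma(n + 1) = log(n!)
        score += lgamma(r) - lgamma(n_ij + r)
        # log prod_k N_ijk!
        score += sum(lgamma(n + 1) for n in child_counts.values())
    # Unobserved parent configurations contribute log(1) = 0, so they are skipped.
    return score

# Hypothetical toy records (NOT trial data): 'dead' perfectly tracks 'old'.
rows = [{"old": 1, "dead": 1}] * 4 + [{"old": 0, "dead": 0}] * 4

with_parent = log_k2_score(rows, "dead", ["old"])
without_parent = log_k2_score(rows, "dead", [])
# A higher (less negative) score favours including the arc old -> dead.
print(with_parent > without_parent)
```

On data where the child depends strongly on the candidate parent, the score with the arc exceeds the score without it, which is the signal a search algorithm uses when deciding whether to add an arc.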
Results: Bayesian network inference detected 119 significant conditional dependencies in the International Stroke Trial dataset. These conditional dependencies were consistent with known clinical associations. Every dependency listed here was significant at P < 2e-16 by the mutual information test. Fourteen-day mortality was conditionally dependent on age at presentation and on major non-cerebral haemorrhage. Six-month outcome was affected by age, conscious level at presentation, presence of a lower limb deficit and hemianopia on examination. It was also affected by recurrence of ischaemic stroke, haemorrhagic stroke and stroke of unknown origin, and was conditionally dependent on discharge within 14 days.
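As an illustration of how a mutual information P value of the kind reported in the Results can arise, the sketch below computes the empirical mutual information between two discrete variables and the associated G statistic (2N times the mutual information, in nats). It assumes the asymptotic chi-square form of the test, which the text does not specify, and it handles only the unconditional case; a conditional test would sum the statistic over the strata defined by the conditioning variables. All function names and data are hypothetical.

```python
from collections import Counter
from math import erfc, log, sqrt

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), n_xy in joint.items():
        # p(x, y) * log[ p(x, y) / (p(x) p(y)) ], written in terms of counts
        mi += (n_xy / n) * log(n_xy * n / (px[x] * py[y]))
    return mi

def g_statistic(xs, ys):
    """G statistic (2 * N * MI) and its chi-square degrees of freedom."""
    g = 2.0 * len(xs) * mutual_information(xs, ys)
    df = (len(set(xs)) - 1) * (len(set(ys)) - 1)
    return g, df

def p_value_df1(g):
    """Chi-square upper tail probability, valid ONLY for 1 degree of freedom."""
    return erfc(sqrt(g / 2.0))

# Hypothetical binary toy variables (NOT trial data).
xs = [0] * 50 + [1] * 50
ys_independent = [0, 1] * 50   # balanced within each level of xs
ys_dependent = list(xs)        # identical to xs

g_ind, df_ind = g_statistic(xs, ys_independent)
g_dep, df_dep = g_statistic(xs, ys_dependent)
p_ind = p_value_df1(g_ind)
p_dep = p_value_df1(g_dep)
print(p_ind, p_dep < 2e-16)
```

For tables with more than one degree of freedom, the tail probability would need a general chi-square survival function from a statistics library rather than the closed form used here.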
Conclusions: We organise the pathogenesis, management and sequelae of stroke as a single functional system, in which clinical phenomena are understood to influence one another. We demonstrate the utility of the method for forming and testing multiple hypotheses in an objective fashion. This methodology is general and may, in principle, be applied to observational datasets across the health sciences.