Conquering confounds and covariates in machine learning

As I tried to study the impact of different deconfounding methods, and to offer covariate regression in neuropredict, I realized that the tools and methods I needed to implement would be useful to the broader machine learning and neuroscience communities. Hence, I set aside a couple of weeks to review the relevant literature (especially in the context of biomarkers and predictive modeling), which convinced me that there are still many open questions! For example, there is no consensus on 1) what really constitutes a confound, 2) when we should try to deconfound, and 3) how to properly assess the impact of confounds. That convinced me even further that we need a common, open and dependable library to conquer confounds in machine learning. So, I’d like to quickly announce the initial beta release of the beautiful Python library called confounds.

Vision / Goals

The high-level and long-term goal of this package is to develop a high-quality library to conquer confounds and covariates in ML applications.

By conquering, we mean methods and tools to

  • visualize and establish the presence of confounds (e.g. by quantifying confound-to-target relationships),
  • offer solutions to handle them appropriately, via correction, removal, etc., and
  • analyze the effect of the deconfounding methods on the processed data (e.g. the ability to check whether they worked at all, or whether they introduced new or unwanted biases).
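
To make these three goals concrete, here is a minimal sketch in plain NumPy. It does not use the confounds API; the simulated data and the helper names (corr_with, fit_residualizer, residualize) are assumptions made purely for illustration, and linear residualization is just one of several possible deconfounding methods.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: one confound (say, age), a target, and three features.
confound = rng.normal(size=n)
target = 0.8 * confound + rng.normal(scale=0.5, size=n)
features = np.column_stack([
    1.5 * confound + rng.normal(size=n),   # strongly confounded
    -0.7 * confound + rng.normal(size=n),  # moderately confounded
    rng.normal(size=n),                    # unrelated to the confound
])


def corr_with(confound, data):
    """Pearson correlation between the confound and each column of data."""
    data = np.asarray(data, dtype=float).reshape(len(confound), -1)
    c = confound - confound.mean()
    d = data - data.mean(axis=0)
    return (c @ d) / (np.linalg.norm(c) * np.linalg.norm(d, axis=0))


def fit_residualizer(confound, data):
    """Least-squares fit of data ~ intercept + confound; returns coefficients."""
    design = np.column_stack([np.ones_like(confound), confound])
    coef, *_ = np.linalg.lstsq(design, data, rcond=None)
    return coef


def residualize(confound, data, coef):
    """Subtract the part of data that the fitted linear model explains."""
    design = np.column_stack([np.ones_like(confound), confound])
    return data - design @ coef


# 1) Establish the presence of the confound.
conf_to_target = corr_with(confound, target)[0]
conf_to_features = corr_with(confound, features)

# 2) Handle it: fit the correction on the training split only, apply to both.
train = np.arange(n) < 350
test = ~train
coef = fit_residualizer(confound[train], features[train])
clean_train = residualize(confound[train], features[train], coef)
clean_test = residualize(confound[test], features[test], coef)

# 3) Check whether the correction worked.
after_train = corr_with(confound[train], clean_train)
after_test = corr_with(confound[test], clean_test)

print("confound-to-target correlation:", conf_to_target)
print("confound-to-feature correlations before:", conf_to_features)
print("after (train):", after_train)
print("after (test): ", after_test)
```

The correction is estimated on the training split alone and then applied to the held-out split, so no information from the test samples leaks into the deconfounding step. On the training split the residual correlations are zero up to numerical precision by construction; on the held-out split they are only approximately zero, which is exactly the kind of check the third goal refers to.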

More docs with usage examples are available in the repo.

Fuller documentation with API reference will be released shortly.

Contributors are most welcome!

