Research
My doctoral thesis, entitled "End-to-end approach to classification in unstructured spaces with application to judicial decisions", focuses on both theoretical and practical Machine Learning. I try to reduce the expertise required by the usual Machine Learning workflow, since that requirement is the first obstacle to the adoption of artificial intelligence solutions.
A detailed summary of my doctoral thesis is available here.
My main contributions are:
- A new mathematical theory for classification, with "good" properties (explainability, no metric required, no hyperparameters, ...) based on hypergraphs and metric learning,
- A generic method to automate most of data preparation using standard hyperparameter tuning techniques,
- The largest curated datasets about the legal domain, on which I reached over 94% accuracy predicting the outcome of a judgment.
My current research interests span different areas of Machine Learning and Artificial Intelligence:
- Stochastic Sequence Hypergraphs for classification,
- Explainability of Machine Learning models,
- AutoML & Automated data preparation,
- Application of AI to the justice domain.
Previously I also worked on:
- Online hyperparameter tuning,
- Multiobjective discrete optimization.
I serve or have served as a reviewer for the following journals and conferences:
- Fuzzy Information and Engineering, Taylor & Francis
- Data & Knowledge Engineering, Elsevier
- Computing, Springer
- Expert Systems With Applications, Elsevier [Certificate]
- Data Analytics solutions for Real-LIfe APplications (DARLI-AP), workshop co-located with EDBT/ICDT Joint Conference
Theses
- End-to-end approach to classification in unstructured spaces with application to judicial decisions Ph.D. thesis
- Insertion of adaptive modalities in the mono or multi objectives evolutionary planner Divide-and-Evolve Master thesis
Publications
- True Pareto Fronts for Realistic Multi-Objective AI Planning Instances To be submitted to International Conference on Automated Planning and Scheduling (ICAPS), 2021
- A Physical Approach to Classification To be submitted to International Conference on Machine Learning (ICML), 2021
- Cautiously Making Friends with AI: Machine Learning for human rights research and practice AI & Human Rights: Friend or Foe?, The Erasmus School of Law, together with the Jean Monnet Centre of Excellence on Digital Governance, 2021
- Paradiseo: From a Modular Framework for Evolutionary Computation to the Automated Design of Metaheuristics Genetic and Evolutionary Computation Conference (GECCO), 2021
- ECHR-OD: On Building an Integrated Open Repository of Legal Documents for Machine Learning Applications Information Systems, 2021
- GBEx, towards Graph-Based Explanations International Conference on Tools with Artificial Intelligence (ICTAI), 2020
- On Integrating and Classifying Legal Text Documents International Conference on Database and Expert Systems Applications (DEXA), 2020
- Two-stage Optimization for Machine Learning Workflow Information Systems, 2020
- Binary Classification In Unstructured Space With Hypergraph Case-Based Reasoning Information Systems, 2019
- Data Pipeline Selection and Optimization International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP) @ International Conference on Extending Database Technology/International Conference on Database Theory (EDBT/ICDT) Joint Conference, 2019
- Binary Classification With Hypergraph Case-Based Reasoning International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP) @ International Conference on Extending Database Technology/International Conference on Database Theory (EDBT/ICDT) Joint Conference, 2018
- AI for the legal domain: an explainability challenge PhD Student Research Competition, IFIP World Computer Congress, 2018
- Unsupervised Video Semantic Partitioning Using IBM Watson and Topic Modelling International Workshop on Data Analytics solutions for Real-LIfe APplications (DARLI-AP) @ International Conference on Extending Database Technology/International Conference on Database Theory (EDBT/ICDT) Joint Conference, 2018
- Data Science Techniques for Law and Justice: Current State of Research and Open Problems Advances in Databases and Information Systems (ADBIS) Workshops and Short papers, 2017
- Solving Large MultiZenoTravel Benchmarks with Divide-and-Evolve Learning and Intelligent Optimization (LION), 2015
- True Pareto Fronts for Multi-objective AI Planning Instances Evolutionary Computation in Combinatorial Optimization (EvoCOP), 2015
Awards
- IBM Innovation Award Restlessly reinvent – our company and ourselves, 2019
- Best Paper, International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data, Lisbon, 2018
- Grant from the Polish Academy of Sciences IFIP World Computer Congress PhD Student Research Competition, 2018
- IBM Analytics Hero Award Restlessly reinvent – our company and ourselves, 2018
- Best Paper, International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data, Vienna, 2018
- IBM Manager’s Choice Award x2 Dare to Create Original Ideas, 2016
Teaching & Supervision
Students and interns under my supervision:
- Graph-based linear explanation for supervised machine learning models Pawel Mroz, Master Thesis, 2018 - 2019
- Design and implementation of a technique to assess regressions associated with GitHub Pull Requests Laetitia Beignon, Internship, Sum. 2019
- Improving predictions of the European Court of Human Rights decisions Amadeusz Masny, Internship, Spr. 2019
- Hyperparameter Tuning: state-of-the-art and benchmarking Sylwia Wronia, Internship, Spr. 2019
- Hyperparameter optimization of Split-and-Merge, a semantic partitioning algorithm Pawel Rzonca, Internship, Sum. 2018
I have participated in the following graduate and undergraduate courses:
- Theoretical Machine Learning Lectures at IBM Krakow Software Lab, 2018 - 2019
- IBM Watson Services Overview Regular presentations at Polish universities, 2016 - 2019
Talks
- A Better Approach to Data Science: the example of COVID-19 HackYeah, Online Webinar, 2020
- PCI Passthrough with Consumer GPU IBM Vitality Talks Cracow, Poland, 2020
- Is practical AutoML more than CASH? GHOST Day: a practical machine learning conference Poznan, Poland, 2019
- Towards Data Pipeline Selection and Optimization IBM CEE Regional Technical Exchange Budapest, Hungary, 2019
- Towards Data Pipeline Optimization PyData Warsaw Cracow, Poland, 2018
- AI for the legal domain: an explainability challenge (extended) IBM Vitality Talks Cracow, Poland, 2018
- Artificial Intelligence Microservices for NLP IBM Vitality Talks Cracow, Poland, 2018
- Can we really compare our algorithms? Beyond worst-case time complexity IBM Vitality Talks Cracow, Poland, 2017
- IBM Watson Services in Scala ScalaSphere Cracow, Poland, 2017
- Data Science Techniques for Law and Justice: Current State of Research and Open Problems IBM Vitality Talks Cracow, Poland, 2017
- Intelligent Home Automation, combining IoT and Machine Learning KrakYourNet 7 Cracow, Poland, 2016
- CESTAC: Stochastic estimation and control of rounding floating-point errors IBM Vitality Talks Cracow, Poland, 2016
- General Parallel File System (GPFS) presentation and administration IBM Vitality Talks Cracow, Poland, 2016