ICM – SBI Tokyo Collaboration

During the COVID-19 pandemic, ICM, in a research partnership with SBI Tokyo, organized workshops on Taxila. The workshops aimed to enable the scientific community to leverage Big Data and supercomputers, together with Natural Language Processing (NLP) and Text Mining, in the fight against the pandemic.

The COVID-19-Taxila framework was presented during the meetings. Each day, the framework collected, aggregated, and organized information about COVID-19 from various sources, including PubMed, arXiv, ClinicalTrials.gov, and open COVID-19 research datasets. Working through specific use cases, researchers from SBI showed workshop participants how to navigate Taxila and how to use its analytical modules to quickly obtain practical information.

The free Text Mining workshops were held in groups of twenty people and were dedicated to medical topics: “Taxila: Empowering the fight against COVID-19 through text” and “Taxila global scientific literature text-mining intelligence for oncology research” (two editions). The events were attended by academic staff, doctors, and researchers from universities and medical schools in Gdańsk, Kraków, Lublin, Białystok, Katowice, and Warsaw, as well as from the National Oncology Institute (Warsaw, Gliwice), the Institute of Mother and Child, the International Institute of Molecular and Cell Biology, the Medical Research Agency, and several other research units. Participants received certificates of attendance.

ICM made the Taxila tool available under a scientific cooperation agreement with SBI (The Systems Biology Institute), drawing on licensed resources collected by the Virtual Library of Science (WBN), which are accessible to licensed Polish institutions. As part of the oncology workshops, the integration of Taxila with WBN allowed for the analysis of 25,000 full-text scientific articles, mainly from Springer and Elsevier journals. Jan Miśkiewicz from the HPC User Support team at ICM was responsible for the substantive support of the project and for organizing the meetings with SBI.


Taxila is a comprehensive analytical platform created by SBI Tokyo that combines state-of-the-art Natural Language Processing and Natural Language Understanding (NLP/NLU) solutions to enable the automatic analysis of text from hundreds of thousands of scientific articles. In particular, by operating on a vast collection of publications, Taxila allows researchers to generate scientific hypotheses that connect different areas of knowledge contained in the text, using tools such as tag analysis, searches for correlations between concepts, and graph visualization.