What?

Software

| Relation Extraction | Named-Entity Recognition and Named-Entity Linking | Semantic Similarity | Recommender Systems | Corpora and Datasets | Challenges and Workshops Participations | Tutorials | Others |

Relation Extraction

  • BiOnt [ github ] Deep learning system to perform relation extraction of gene-products, phenotypes, diseases, and chemical compounds.
  • BO-LSTM [ github ] Deep learning system to perform relation extraction of drug-drug and gene-phenotype relations.

Named-Entity Recognition and Named-Entity Linking

  • NILINKER [ github ] Attention-based approach to NIL entity linking.
  • REEL [ github ] Relation Extraction for Entity Linking.
  • MER [ github | Web Tool ] Minimal Named-Entity Recognizer. A named-entity recognition tool that, given any lexicon and any input text, returns the list of terms recognized in the text, including their exact locations (annotations).
  • merpy [ github ] MER Python interface.
  • PPR-SSM [ github ] Personalized PageRank and Semantic Similarity Measures for Entity Linking.
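
The lexicon-based recognition described for MER above can be illustrated with a minimal Python sketch. This is not MER's implementation or the merpy interface: the function name `annotate`, the case-insensitive whole-word matching, and the sample text and lexicon are all assumptions made for illustration.

```python
import re


def annotate(text, lexicon):
    """Return sorted (start, end, matched_text) tuples for lexicon terms in text.

    Hypothetical helper: matching is case-insensitive and restricted to
    whole words, which is an assumption and not necessarily MER's behaviour.
    """
    annotations = []
    for term in lexicon:
        # Escape the term so punctuation inside it is matched literally.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((match.start(), match.end(), match.group(0)))
    return sorted(annotations)


if __name__ == "__main__":
    sample_text = "Aspirin reduces fever; aspirin is a drug."
    sample_lexicon = ["aspirin", "fever"]
    for start, end, term in annotate(sample_text, sample_lexicon):
        print(start, end, term)
```

Each tuple gives the character offsets of a recognized term, which is the kind of "exact location" annotation the description refers to.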

Semantic Similarity

  • DiShIn [ github | Web Tool ] Semantic Similarity Measures using Disjunctive Shared Information.
  • JaccardSimilarity [ github ] Calculates the Jaccard Similarity Coefficient between two sets of terms based on their ancestors in an OBO ontology.
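
As a rough illustration of the ancestor-based Jaccard coefficient mentioned above, the following Python sketch computes |A ∩ B| / |A ∪ B| over the ancestor sets of two groups of terms. It is not the JaccardSimilarity tool's code: the toy `PARENTS` hierarchy, the function names, and the choice to count each term as its own ancestor are assumptions.

```python
# Toy is-a hierarchy (child -> parents); a real OBO ontology would be parsed instead.
PARENTS = {
    "d": ["b"],
    "e": ["b", "c"],
    "b": ["a"],
    "c": ["a"],
}


def ancestors(term, parents):
    """Return the term plus all of its ancestors (assumption: the term is included)."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def jaccard(terms_a, terms_b, parents):
    """Jaccard coefficient between the ancestor sets of two collections of terms."""
    set_a = set().union(*(ancestors(t, parents) for t in terms_a))
    set_b = set().union(*(ancestors(t, parents) for t in terms_b))
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty inputs
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    print(jaccard(["d"], ["e"], PARENTS))
```

In this toy hierarchy, "d" and "e" share the ancestors "a" and "b" out of five distinct terms in total, so the coefficient is 2/5.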

Recommender Systems

  • ChemRecSys [ github ] Chemical Compounds Recommender System.
  • DRecPy [ github ] Deep Recommenders with Python. A Python library for building deep learning-based recommenders.

Corpora and Datasets

  • CheRM [ github ] Chemical Compounds Recommender Matrix. Dataset of Chemical Compounds.
  • cARM [ github ] Creates a dataset suitable for evaluating recommender systems for open clusters of stars using scientific literature.
  • NPDR [ github ] A Dataset of Negative Human Phenotype-Disease Relations. Describes a subset of negative disease-phenotype relations from a gold-standard knowledge base made available by the Human Phenotype Ontology.
  • PGR-crowd [ github ] A hybrid approach toward biomedical relation extraction training corpora: combining distant supervision with crowdsourcing.
  • BiQA [ github ] Generating Scientific Question Answering Corpora from Q&A forums.
  • PGR [ github ] A Silver Standard Corpus of Human Phenotype-Gene Relations.

Challenges and Workshops Participations

  • BioCreative VII Track 1 – Text mining drug and chemical-protein interactions (DrugProt) [ github ]
  • BioASQ Task Synergy v2 [ github ]
  • MESINESP2: Medical Semantic Indexing in Spanish Shared Task [ github ]
  • BioASQ Task 9B [ github ]
  • ProfNER-ST: Social Media Mining for Health Applications (#SMM4H) Shared Task 2021 (Track A & B) [ github ]
  • 7th Biomedical Linked Annotation Hackathon (BLAH7) [ github ]
  • 6th Biomedical Linked Annotation Hackathon (BLAH6) [ github ]
  • CANTEMIST: CANcer TExt Mining Shared Task – tumor named entity recognition [ github ]
  • 1st edition of BioASQ MESINESP task [ github ]

Tutorials

  • RecSys.Scifi: Recommender Systems Datasets in Scientific Fields. Tutorial at ACM Conference on Knowledge Discovery and Data Mining (KDD) [ github | WebPage ]

Others

Other tools developed during the project are available as open source on GitHub [ lasigeBioTM ]

Some are also available as web tools: http://labs.rd.ciencias.ulisboa.pt/

Publications

Books | Chapters | Journal papers | Conference papers | Workshop papers | Technical reports | Theses | Talks and Tutorials | Proceedings |

Books

  • F. Couto, Data and Text Processing for Health and Life Sciences. No. 1137 in Advances in Experimental Medicine and Biology, Springer, 2019. [ DOI | pdf | scholar ]
  • F. Couto, Introdução à Bioinformática Via Linha de Comando. Trajectos / Ciência, Gradiva, 2019. [ url ]

Chapters

  • D. Sousa, A. Lamurias, and F. Couto, “Using neural networks for relation extraction from biomedical literature,” in Artificial Neural Networks. Methods in Molecular Biology (H. Cartwright, ed.), vol. 2190, pp. 289–305, Humana, New York, NY, 2020. [ DOI | pdf | scholar ]
  • J. Ferreira and F. Couto, “Semantic similarity in cheminformatics,” in Cheminformatics and its Applications (A. Stefaniu, ed.), IntechOpen, 2019. [ DOI | pdf | scholar ]
  • A. Lamurias and F. Couto, “Text mining for bioinformatics using biomedical literature,” in Encyclopedia of Bioinformatics and Computational Biology (S. Ranganathan, K. Nakai, C. Schönbach, and M. Gribskov, eds.), vol. 1, pp. 602–611, Oxford: Elsevier, 2019. [ DOI | pdf | scholar ]
  • F. Couto and A. Lamurias, “Semantic similarity definition,” in Encyclopedia of Bioinformatics and Computational Biology (S. Ranganathan, K. Nakai, C. Schönbach, and M. Gribskov, eds.), vol. 1, pp. 870–876, Oxford: Elsevier, 2019. [ DOI | pdf | scholar ]

Journal papers

  • S. Conceição and F. Couto, “Text mining for building biomedical networks using cancer as a case study,” Biomolecules, vol. 11, no. 10, pp. 1430 (1–12), 2021. [ DOI | pdf | pubmed | scholar ]
  • A. Rodrigues, J. Santinha, B. Galvão, C. Matos, F. Couto, and N. Papanikolaou, “Prediction of prostate cancer disease aggressiveness using bi-parametric MRI radiomics,” Cancers, vol. 13, no. 23, pp. 6065 (1–17), 2021. [ DOI | pdf | pubmed | scholar ]
  • M. Barros, P. Ruas, D. Sousa, A. Bangash, and F. Couto, “COVID-19 recommender system based on an annotated multilingual corpus,” Genomics & Informatics, vol. 19, no. 3, pp. e24 (1–7), 2021. [ DOI | pdf | pubmed | scholar ]
  • A. Lamurias, S. Jesus, V. Neveu, R. Salek, and F. Couto, “Information retrieval using machine learning for biomarker curation in the exposome-explorer,” Frontiers in Research Metrics and Analytics, vol. 6, no. 55, pp. 1–10, 2021. [ DOI | pdf | pubmed | scholar ]
  • M. Barros, A. Moitinho, and F. Couto, “Hybrid semantic recommender system for chemical compounds in large‐scale datasets,” Journal of Cheminformatics, vol. 13, no. 15, pp. 1–15, 2021. [ DOI | pdf | pubmed | scholar ]
  • D. Sousa, A. Lamurias, and F. Couto, “A hybrid approach toward biomedical relation extraction training corpora: combining distant supervision with crowdsourcing,” Database: The Journal of Biological Databases and Curation, vol. 2020, no. baaa104, pp. 1–15, 2020. [ DOI | pdf | pubmed | scholar ]
  • P. Ruas, A. Lamurias, and F. Couto, “Linking chemical and disease entities to ontologies by integrating PageRank with extracted relations from literature,” Journal of Cheminformatics, vol. 12, no. 57, pp. 1–15, 2020. [ DOI | pdf | pubmed | scholar ]
  • A. Lamurias, D. Sousa, and F. Couto, “Generating biomedical question answering corpora from Q&A forums,” IEEE Access, vol. 8, no. 0, pp. 161042–161051, 2020. [ DOI | pdf | scholar ]
  • S. Nunes, S. Little, S. Bhatia, L. Boratto, G. Cabanac, R. Campos, F. Couto, S. Faralli, I. Frommholz, A. Jatowt, A. Jorge, M. Marras, P. Mayr, and G. Stilo, “ECIR 2020 workshops: Assessing the impact of going online,” SIGIR Forum, vol. 54, no. 1, pp. 1–11, 2020. [ DOI | pdf | scholar ]
  • D. Sousa, A. Lamurias, and F. Couto, “Improving accessibility and distinction between negative results in biomedical relation extraction,” Genomics & Informatics, vol. 18, no. 1, p. e15, 2020. [ DOI | pdf | pubmed | scholar ]
  • M. Fernandes, J. Decouchant, M. Völp, F. Couto, and P. Veríssimo, “DNA-SeAl: Sensitivity levels to optimize the performance of privacy-preserving DNA alignment,” IEEE Journal of Biomedical and Health Informatics, vol. 24, no. 3, pp. 907–915, 2020. [ DOI | pdf | pubmed | scholar ]
  • M. Asif, H. Martiniano, A. Marques, J. Santos, J. Vilela, C. Rasga, G. Oliveira, F. Couto, and A. Vicente, “Identification of biological mechanisms underlying a multidimensional ASD phenotype using machine learning,” Translational Psychiatry, vol. 10, no. 43, pp. 1–12, 2020. [ DOI | pdf | pubmed | scholar ]
  • M. Barros, A. Moitinho, and F. Couto, “Using research literature to generate datasets of implicit feedback for recommending scientific items,” IEEE Access, vol. 7, no. 0, pp. 176668–176680, 2019. [ DOI | pdf | scholar ]
  • A. Lamurias, P. Ruas, and F. Couto, “PPR-SSM: personalized pagerank and semantic similarity measures for entity linking,” BMC Bioinformatics, vol. 20, no. 534, pp. 1–12, 2019. [ DOI | pdf | pubmed | scholar ]
  • M. Asif, A. Vicente, and F. Couto, “Funvar: A systematic pipeline to unravel the convergence patterns of genetic variants in ASD, a paradigmatic complex disease,” Journal of Biomedical Informatics, pp. 1–29, 2019. [ DOI | pdf | pubmed | scholar ]
  • J. Ferreira and F. Couto, “Multi-domain semantic similarity in biomedical research,” BMC Bioinformatics, vol. 20, no. 246, pp. 1–9, 2019. [ DOI | pdf | pubmed | scholar ]
  • A. Lamurias, D. Sousa, L. Clarke, and F. Couto, “BO-LSTM: classifying relations via long short-term memory networks along biomedical ontologies,” BMC Bioinformatics, vol. 20, no. 10, pp. 1–12, 2019. [ DOI | pdf | pubmed | scholar ]
  • F. Couto and A. Lamurias, “MER: a shell script and annotation server for minimal named entity recognition and linking,” Journal of Cheminformatics, vol. 10, no. 58, pp. 1–10, 2018. [ DOI | pdf | pubmed | scholar ]
  • M. Asif, H. Martiniano, A. Vicente, and F. Couto, “Identifying disease genes using machine learning and gene functional similarities, assessed through Gene Ontology,” PLoS ONE, vol. 13, no. 12, pp. 1–15, 2018. [ DOI | pdf | pubmed | scholar ]
  • J. Santana, K. Gramacho, K. Ferreira, R. Rezende, P. Mangabeira, R. Dias, F. Couto, and C. Pirovani, “Witches’ broom resistant genotype CCN51 shows greater diversity of symbiont bacteria in its phylloplane than susceptible genotype catongo,” BMC Microbiology, vol. 18, no. 1, pp. 194 (1–10), 2018. [ DOI | pdf | pubmed | scholar ]
  • J. Decouchant, M. Fernandes, M. Völp, F. Couto, and P. Veríssimo, “Accurate filtering of privacy-sensitive information in raw genomic data,” Journal of Biomedical Informatics, vol. 82, pp. 1–12, 2018. [ DOI | pdf | pubmed | scholar ]

Conference papers

  • R. West, S. Bhagat, P. Groth, M. Zitnik, F. M. Couto, P. Lisena, A. Meroño-Peñuela, X. Zhao, W. Fan, D. Yin, J. Tang, L. Shou, M. Gong, J. Pei, X. Geng, X. Zhou, D. Jiang, B. Ricaud, N. Aspert, V. Miz, J. Dy, S. Ioannidis, I. Yildiz, R. Rezapour, S. Aref, L. Dinh, J. Diesner, A. Drutsa, D. Ustalov, N. Popov, D. Baidakova, S. Mishra, A. Gopalan, D. Juan, C. I. Magalhaes, C. Ferng, A. Heydon, C. Lu, P. Pham, G. Yu, Y. Fan, Y. Wang, F. Laurent, Y. Schraner, C. Scheller, S. Mohanty, J. Chen, X. Wang, F. Feng, X. He, I. Teinemaa, J. Albert, D. Goldenberg, F. Vasile, D. Rohde, O. Jeunen, A. Benhalloum, O. Sakhi, Y. Rong, W. Huang, T. Xu, Y. Bian, H. Cheng, F. Sun, J. Huang, S. Fakhraei, C. Faloutsos, O. Çelebi, M. Müller, M. Schneider, O. Altunina, W. Wingerath, B. Wollmer, F. Gessert, S. Succo, N. Ritter, E. Courdier, T. M. Avram, D. Cvetinovic, L. Tsinadze, J. Jose, R. Howell, M. Koenig, M. Defferrard, K. Kenthapadi, B. Packer, M. Sameki, and N. Sephus, “Summary of tutorials at The Web Conference 2021,” in Companion Proceedings of the Web Conference 2021, pp. 727–733, 2021. [ pdf | url ]
  • F. Colaço, M. Barros, and F. Couto, “DRecPy: A Python framework for developing deep learning-based recommenders,” in 14th ACM Conference on Recommender Systems (RECSYS 2020) (Late-breaking results), pp. 675–680, 2020. [ pdf | url | scholar ]
  • M. Barros, A. Moitinho, and F. Couto, “Hybrid semantic recommender system for chemical compounds,” in 42nd European Conference on Information Retrieval (ECIR 2020), vol. 12036, pp. 1–8, 2020. [ pdf | url | scholar ]
  • D. Sousa and F. Couto, “BiOnt: Deep learning using multiple biomedical ontologies for relation extraction,” in 42nd European Conference on Information Retrieval (ECIR 2020), vol. 12036, pp. 1–8, 2020. [ pdf | url | scholar ]
  • F. Couto and M. Krallinger, “Proposal of the first international workshop on semantic indexing and information retrieval for health from heterogeneous content types and languages (SIIRH),” in 42nd European Conference on Information Retrieval (ECIR 2020), vol. 12036, pp. 1–6, 2020. [ pdf | url | scholar ]
  • D. Sousa, A. Lamurias, and F. Couto, “A silver standard corpus of human phenotype-gene relations,” in Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2019), vol. N19-1152, pp. 1487–1492, 2019. [ pdf | url | scholar ]

Workshop papers

  • D. Faria, B. Lima, M. Silva, F. Couto, and C. Pesquita, “AML and AMLC results for OAEI 2021,” in 16th International Workshop on Ontology Matching (OM-2021), 2021. [ pdf | url ]
  • D. Sousa, R. Cassanheira, and F. Couto, “lasigeBioTM at BioCreative VII Track 1: Text mining drug and chemical-protein interactions using biomedical ontologies,” in BioCreative VII Challenge Evaluation, 2021. [ pdf | url ]
  • M. Campos and F. Couto, “Post-processing BioBERT and using voting methods for biomedical question answering,” in 9th edition of BioASQ: Large-scale Biomedical Semantic Indexing and Question Answering (CLEF 2021 Working Notes), 2021. [ pdf | url ]
  • P. Ruas, V. Andrade, and F. Couto, “LASIGE-BioTM at MESINESP2: entity linking with semantic similarity and extreme multi-label classification on Spanish biomedical documents,” in 9th edition of BioASQ: Large-scale Biomedical Semantic Indexing and Question Answering (CLEF 2021 Working Notes), 2021. [ pdf | url ]
  • P. Ruas, V. Andrade, and F. Couto, “Lasige-BioTM at ProfNER: BiLSTM-CRF and contextual Spanish embeddings for named entity recognition and tweet binary classification,” in 6th Social Media Mining for Health Workshop, 2021. [ pdf | url ]
  • M. Barros, A. Lamurias, D. Sousa, P. Ruas, and F. Couto, “COVID-19: A semantic-based pipeline for recommending biomedical entities,” in 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020, 2020. [ pdf | url ]
  • A. Neves, A. Lamurias, and F. Couto, “Biomedical question answering using extreme multi-label classification and ontologies in the multilingual panorama,” in The Eighth edition of BioASQ: Large-scale Biomedical Semantic Indexing and Question Answering, 2020. [ pdf | url ]
  • P. Ruas, A. Lamurias, and F. Couto, “LasigeBioTM team at CLEF2020 ChEMU evaluation lab: Named entity recognition and event extraction from chemical reactions described in patents using BioBERT NER and RE,” in The workshop ChEMU: Named Entity Recognition and Event Extraction of Chemical Reactions from Patents, 2020. [ pdf | url ]
  • J. He, D. Nguyen, S. Akhondi, C. Druckenbrodt, C. Thorne, R. Hoessel, Z. Afzal, Z. Zhai, B. Fang, H. Yoshikawa, A. Albahem, J. Wang, Y. Ren, Z. Zhang, Y. Zhang, M. Dao, P. Ruas, A. Lamurias, F. Couto, J. Copara, N. Naderi, J. Knafou, P. Ruch, D. Teodoro, D. Lowe, J. Mayfield, A. Köksal, H. Dönmez, E. Özkırımlı, A. Özgür, D. Mahendran, G. Gurdin, N. Lewinski, C. Tang, B. McInnes, M. C.S., P. Rao, S. Devi, L. Cavedon, T. Cohn, T. Baldwin, and K. Verspoor, “An extended overview of the CLEF 2020 ChEMU lab: Information extraction of chemical reactions from patents,” in The workshop ChEMU: Named Entity Recognition and Event Extraction of Chemical Reactions from Patents, 2020. [ pdf | url ]
  • P. Ruas, A. Neves, V. Andrade, and F. Couto, “LasigeBioTM at CANTEMIST: Named entity recognition and normalization of tumour morphology entities and clinical coding of Spanish health-related documents,” in The Iberian Languages Evaluation Forum (IberLEF 2020), 2020. [ pdf | url ]
  • A. Neves, A. Lamurias, and F. Couto, “Biomedical question answering using extreme multi-label classification and ontologies in the multilingual panorama,” in The First International Workshop on Semantic Indexing and Information Retrieval for Health from heterogeneous content types and languages (SIIRH), 2020. [ pdf | url ]
  • P. Ruas, A. Lamurias, and F. Couto, “Towards a multilingual corpus for named entity linking evaluation in the clinical domain,” in The First International Workshop on Semantic Indexing and Information Retrieval for Health from heterogeneous content types and languages (SIIRH), 2020. [ pdf | url ]
  • F. Couto and M. Krallinger, “Report of the first international workshop on semantic indexing and information retrieval for health from heterogeneous content types and languages,” in The First International Workshop on Semantic Indexing and Information Retrieval for Health from heterogeneous content types and languages (SIIRH), 2020. [ pdf | url ]

Theses

  • M. Barros, “Recommender system to support comprehensive exploration of large scale scientific datasets”, PhD Thesis, 2022.
  • L. Torcato, “Extracting Negative Biomedical Relations from Literature”, Master Thesis, 2021.
  • V. Andrade, “Named Entity Recognition and Linking in a Multilingual Biomedical Setting”, Master Thesis, 2021.
  • M. Campos, “Biomedical Question Answering with Deep Learning”, Master Thesis, 2021.
  • M. Vieira, “Development of a Corpus for User-based Scientific Question Answering”, Master Thesis, 2021.
  • P. Santos, “Validation of Automatic Similarity Measures”, Master Thesis, 2021.
  • F. Colaço, “Recommender Systems Based on Deep Learning Techniques”, Master Thesis, 2020.
  • A. Neves, “Applying deep learning extreme multi-label classification to the biomedical and multilingual panoramas”, Master Thesis, 2020.
  • P. Ruas, “Exploring Biomedical Ontologies, Personalized PageRank and Semantic Similarity”, Master Thesis, 2019.
  • S. Jesus, “Information Retrieval using Machine Learning for Database Curation”, Master Thesis, 2019.
  • D. Sousa, “Extracting Phenotype-Gene Relations from Biomedical Literature Using Distant”, Master Thesis, 2019.
  • A. Lamurias, “Development of Text Mining Approach to Disease Network Discovery”, PhD Thesis, 2019.
  • T. Maldonado, “Extracting Biomedical Relations From Biomedical Literature”, Master Thesis, 2018.

Talks and Tutorials

  • M. Barros, F. Couto, M. Pato, and P. Ruas, “Creating recommender systems datasets in scientific fields.” Tutorial at the ACM Conference on Knowledge Discovery and Data Mining (KDD2021), Singapore, August 2021.
  • F. Couto, “Exploring biomedical web resources using shell scripting.” Tutorial at the 30th Web Conference (WWW2021), Ljubljana, Slovenia, April 2021.
  • F. Couto, “Biomedical text processing using semantics.” Tutorial at the 43rd European Conference on Information Retrieval (ECIR2021), Lucca, Italy, March 2021.
  • F. Couto, “Multilingual resources and community Q&A forums.” Participation in the Discussion Panel of the 8th BioASQ 2020 Workshop, September 2020.
  • F. Couto, “Biomedical data and text processing using shell scripting.” Tutorial at the 19th European Conference on Computational Biology (ECCB2020), Barcelona, Spain, September 2020.
  • F. Couto, “Bioinformática: processamento de dados e texto biológico via linha de comando.” Talk at the II Semana de Biologia da Universidade Federal do Sul da Bahia, Brazil, September 2020.
  • F. Couto, “Multilingual text mining: Overcoming the lack of clinical NLP resources.” Talk at the First Multilingual clinical NLP workshop (MUCLIN), Geneva, Switzerland, July 2020.
  • F. Couto, “Introdução à bioinformática via linha de comando.” Tutorial at the Faculdade de Ciências e Tecnologias da Universidade Nova de Lisboa, Caparica, December 2019.
  • M. Barros, “Recommender Systems for Scientific Items”, Seminar at the Universidade de Lisboa, Lisbon, 2019.
  • A. Lamurias, “Text mining approaches for Bioinformatics”, Seminar at the Universidade de Lisboa, Lisbon, 2019.
  • D. Sousa, “Bioinformatics and computational biology”, Seminar at the Universidade de Lisboa, Lisbon, 2019.
  • F. Couto, “Introdução à bioinformática via linha de comando.” Tutorial at the V Encontro de Ecologia, Universidade de Coimbra, November 2019.
  • F. Couto, “Data and text processing in health and life sciences: an example driven workshop using shell scripting.” Tutorial at the Universidade do Algarve, Faro, July 2019.
  • F. Couto, “Data and text processing in health and life sciences: an example driven workshop using shell scripting.” Tutorial at the Bioinformatics Open Days, Universidade do Minho, Braga, February 2019.
  • A. Lamurias, “Bioinformatics.” Talk at the Universidade da Beira Interior, Covilhã, 2019.
  • A. Lamurias, “Text mining tools for biomedicine.” Demo at the Cool Tools for Science event, Champalimaud Foundation, Lisbon, 2019.
  • A. Lamurias, “Extracting microRNA-gene relations from biomedical literature using distant supervision”, Talk in the COST action meeting “The role of text mining in curation workflows”, Málaga, Spain, 2019.
  • A. Lamurias, “Text mining approaches for bioinformatics”, Seminar at the Universidade de Lisboa, Lisbon, 2018.

Proceedings

  • F. Couto and M. Krallinger, eds., First International Workshop on Semantic Indexing and Information Retrieval for Health from heterogeneous content types and languages (SIIRH), vol. 2619, CEUR Workshop Proceedings (CEUR-WS.org), 2020. [ url ]