Daniel Ortiz Martínez is a researcher in biomedicine at Barcelona's Institute for Research in Biomedicine (IRB). Formerly, he was a natural language processing engineer and technical leader at Webinterpret and a visiting lecturer in the Master's Degree in Data Science offered by the University of Valencia. Daniel was also a member of the PRHLT research centre and a lecturer at the Statistics Department of the Technical University of Valencia. His research interests focus on machine learning and data science and their application to biomedical and natural language processing research. Daniel has worked on several research projects, including the MIPRCV project (funded by the Spanish Government and the European Commission within the Consolider programme) and the CASMACAT project (funded by the European Commission under the Seventh Framework Programme). He has published more than 50 research papers in international conferences and journals and has co-supervised one PhD thesis. Additionally, he has served as a scientific reviewer for the European Commission and for various scientific and program committees of conferences and journals. Daniel is also the creator and maintainer of the open-source Thot toolkit, a software package for statistical machine translation, as well as two additional packages for bioinformatics.
April 2015-Jan 2016
Dec 2010-Jan 2016
Feb 2012-Dec 2014
Mar 2011-May 2011
Jul 2008-Jan 2012
Mar 2003-Jun 2008
Daniel is the creator and maintainer of the Thot toolkit for statistical machine translation. The toolkit, written in C, C++, Python and shell script and publicly available under the LGPL license, comprises more than 50 000 lines of code and offers many useful tools for statistical modelling and search in statistical machine translation. Thot is strongly focused on online learning techniques that incrementally train statistical model parameters. Among other functionality, the toolkit provides an implementation of the incremental EM algorithm for HMMs that can be applied to datasets of arbitrary size using MapReduce. Thot was one of the two official statistical toolkits used within the CASMACAT project.
Daniel also created and currently maintains two bioinformatics software packages: the Flux Capacitor toolkit for systems biology and the snptools package for working with single nucleotide polymorphisms (SNPs).