Daniel Ortiz-Martínez is a natural language processing engineer and technical leader at Webinterpret and a visiting lecturer in the Master's Degree in Data Science offered by the University of Valencia. Formerly, he was a member of the PRHLT research centre as well as a lecturer at the Statistics Department of the Technical University of Valencia. His research interests focus on pattern recognition and machine learning and their application to statistical machine translation. His most recent work consists of applying online learning techniques to incrementally train the model parameters of statistical translation tools. Daniel has worked on several research projects, including the MIPRCV project (funded by the Spanish Government and the European Commission within the Consolider programme) and the CASMACAT project (funded by the European Commission under the Seventh Framework Programme). He has published more than 50 research papers in international conferences and journals and has co-supervised one PhD thesis. Additionally, he has served as a scientific reviewer for the European Commission as well as for various scientific and program committees of conferences and journals. Daniel is also the creator and maintainer of the open source Thot toolkit, a software package for statistical machine translation. Finally, Daniel is also interested in the field of bioinformatics, completing a Master's degree in this area offered by the University of Valencia and creating and maintaining two open source bioinformatics software packages.
April 2015-Jan 2016
Dec 2010-Jan 2016
Feb 2012-Dec 2014
Mar 2011-May 2011
Jul 2008-Jan 2012
Mar 2003-Jun 2008
Daniel is the creator and maintainer of the Thot toolkit for statistical machine translation. This toolkit, written in C, C++, Python and shell script and publicly available under the LGPL license, comprises more than 50,000 lines of code and offers many useful tools for statistical modelling and search in the field of statistical machine translation. The Thot toolkit is strongly focused on the use of online learning techniques to incrementally train statistical model parameters. Among the functionalities provided by the toolkit is an implementation of the incremental EM algorithm for HMMs that can be applied to datasets of arbitrary size using MapReduce. The Thot toolkit has been one of the two official statistical toolkits used within the CASMACAT project.
Daniel also created and currently maintains two software packages related to bioinformatics: the Flux Capacitor toolkit for systems biology and the snptools package for working with single nucleotide polymorphisms (SNPs).