Professor Rob Gaizauskas

BA, MA, DPhil

School of Computer Science

Professor of Natural Language Processing

Co-Director of CDT in Speech and Language Technologies

Member of the Natural Language Processing (NLP) research group

r.gaizauskas@sheffield.ac.uk

Full contact details

Professor Rob Gaizauskas
School of Computer Science
Regent Court (DCS)
211 Portobello
Sheffield
S1 4DP
Profile

Rob Gaizauskas studied Mathematics and Physics at the University of Toronto from 1972 to 1974, then moved to Carleton University in Ottawa, where he received an Honours BA in Philosophy in 1975 and an MA in Philosophy (with distinction) in 1978. Following two years teaching Logic as a temporary lecturer at Carleton, he obtained a Diploma in Information Processing from Algonquin College, Ottawa, in 1981.

He then worked for several software companies in Ottawa, including Domus Software, Nabu Technologies, and Fulcrum Technologies (now part of Hummingbird), before moving to the U.K. in 1985, thanks to a Canadian SSHRC Doctoral Fellowship and British Council ORS award, to study for a DPhil in the School of Cognitive and Computing Sciences (now the Department of Informatics) at the University of Sussex.

He received his MA in Cognitive Studies in 1986 and was awarded his DPhil in 1992. During 1989 he lectured in Artificial Intelligence at Sussex. From 1990 to 1993 he worked as a Research Associate at the University of Sussex.

In 1993 he became a Lecturer in the Natural Language Processing Group of the Department of Computer Science at the University of Sheffield. He became a Reader in Computer Science in the same group in 1999 and a Professor in 2002.

Research interests

Rob's research interests are in natural language processing, specifically in information extraction from natural language texts, software architectures for natural language processing and evaluation of language processing systems.

Publications

Books

  • (2005) . Oxford University Press, Oxford.

Journal articles

  • Tait J, Gaizauskas R & Bontcheva K (2023) . Computational Linguistics, 49(3), 767-772.
  • Tang Y, Wang JK, Wang X, Gao B, Dellandréa E, Gaizauskas R & Chen L (2018) . IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12), 3045-3058.
  • Gaizauskas R, Paramita ML, Barker E, Pinnis M, Aker A & Pahisa Solé M (2015) . Terminology, 21(2), 205-236.
  • Preiss J, Stevenson M & Gaizauskas R (2015) . Journal of the American Medical Informatics Association, 22(5), 987-992.
  • Aker A & Gaizauskas R (2015) . Journal of the Association for Information Science and Technology, 66(4), 721-738.
  • Aker A, Paramita ML, Barker E & Gaizauskas R (2014) Bootstrapping term extractors for multiple languages. Proceedings of the 9th International Conference on Language Resources and Evaluation LREC 2014, 483-489.
  • Aker A, Paramita ML, Pinnis M & Gaizauskas R (2014) Bilingual dictionaries for all EU languages. Proceedings of the 9th International Conference on Language Resources and Evaluation LREC 2014, 2839-2845.
  • Alhelbawy A & Gaizauskas R (2013) . Proceedings 2013 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology Workshops WI-IATW 2013, 3, 159-162.
  • Derczynski L & Gaizauskas R (2013) . ACM International Conference Proceeding Series, 129-130.
  • Di Fabbrizio G, Aker A & Gaizauskas R (2013) Summarizing On-line Product and Service Reviews Using Aspect Rating Distributions and Language Modeling. IEEE Intelligent Systems.
  • Aker A, Plaza L & Lloret E (2012) Do humans have conceptual models about Geographic Objects? A user study. Journal of the American Society for Information Science and Technology (JASIST).
  • Aker A, Fan X, Sanderson M & Gaizauskas R (2012) . Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 7224 LNCS, 472-475.
  • Gaizauskas R, Barker E, Chang C-L, Derczynski L, Phiri M & Peng C (2012) Applying ISO-Space to Healthcare Facility Design Evaluation Reports. Seventh Workshop on Interoperable Semantic Annotation (ISA), Eighth International Conference on Language Resources and Evaluation, 13-20.
  • Di Fabbrizio G, Aker A & Gaizauskas R (2011) . Proceedings IEEE International Conference on Data Mining ICDM, 67-74.
  • Aker A & Gaizauskas R (2011) . International Conference on Information and Knowledge Management Proceedings, 1929-1932.
  • Skadiņa I, Aker A, Giouli V, Tufis D, Gaizauskas R, Mieriņa M & Mastropavlos N (2010) . Frontiers in Artificial Intelligence and Applications, 219, 161-168.
  • Verhagen M, Gaizauskas RJ, Schilder F, Hepple M, Moszkowicz J & Pustejovsky J (2009) The TempEval challenge: identifying temporal relations in text. Lang. Resour. Evaluation, 43, 161-179.
  • Aker A & Gaizauskas R (2009) Summary generation for toponym-referenced images using object type language models. International Conference Recent Advances in Natural Language Processing RANLP, 6-11.
  • Stevenson M, Guo Y, Gaizauskas R & Martinez D (2008) . BMC Bioinformatics, 9(Suppl 11), S7.
  • Roberts A, Gaizauskas R, Hepple M & Guo Y (2008) . BMC Bioinformatics, 9(Suppl 11), S3.
  • Roberts A, Gaizauskas R, Hepple M, Davis N, Demetriou G, Guo Y, Kola J, Roberts I, Setzer A, Tapuria A & Wheeldin B (2007) The CLEF corpus: semantic annotation of clinical text. AMIA Annu Symp Proc, 2007, 625-629.
  • Harkema H, Roberts I, Gaizauskas R & Hepple M (2005) . Comp Funct Genomics, 6(1-2), 86-93.
  • Baker P, Hardie A, McEnery T, Xiao R, Bontcheva K, Cunningham H, Gaizauskas RJ, Hamza O, Maynard D, Tablan V, Ursu C et al (2004) Corpus Linguistics and South Asian Languages: Corpus Creation and Tool Development. Lit. Linguistic Comput., 19, 509-524.
  • Gaizauskas R, Davis N, Demetriou G, Guo Y & Roberts I (2004) . Proceedings 2004 IEEE International Conference on Services Computing SCC 2004, 145-152.
  • Gaizauskas R (2003) Recent advances in computational terminology. Computational Linguistics, 29(2), 328-332.
  • Gaizauskas RJ, Demetriou G, Artymiuk PJ & Willett P (2003) . Bioinform., 19, 135-143.
  • Hirschman L & Gaizauskas RJ (2001) . Nat. Lang. Eng., 7, 275-300.
  • Humphreys K, Demetriou G & Gaizauskas R (2000) . Pac Symp Biocomput, 505-516.
  • Humphreys K, Demetriou G & Gaizauskas R (2000) Bioinformatics applications of information extraction from scientific journal articles. Journal of Information Science, 26(2), 75-85.
  • Humphreys K, Demetriou G & Gaizauskas R (2000) . Journal of Information Science, 26(2), 75-85.
  • Krotov A, Hepple M, Gaizauskas RJ & Wilks Y (1999) Evaluating two methods for Treebank grammar compaction. Nat. Lang. Eng., 5, 377-394.
  • Gaizauskas RJ (1998) Karen Sparck Jones and Julia Galliers, Evaluating Natural Language Processing Systems: An Analysis and Review. Berlin: Springer-Verlag, 1996. ISBN 3 540 61309 9, Price DM54.00 (paperback), 228 pages. Nat. Lang. Eng., 4, 175-190.
  • Gaizauskas R & Wilks Y (1998) Information extraction: Beyond document retrieval. Journal of Documentation, 54(1), 70-105.
  • Gaizauskas R & Humphreys K (1997) . Natural Language Engineering, 3(2), 147-169.
  • Cunningham H, Wilks Y & Gaizauskas RJ (1996) New Methods, Current Trends and Software Infrastructure for NLP. Proceedings of NEMLAP-2.
  • Cunningham H, Gaizauskas RJ & Wilks Y (1996) A General Architecture for Language Engineering (GATE) - a new approach to Language Engineering R&D.
  • Evans R, Gaizauskas R, Cahill LJ, Walker J, Richardson J & Dixon A (1995) . Natural Language Engineering, 1(4), 363-388.

Book chapters

  • Paramita ML, Aker A, Clough P, Gaizauskas R, Glaros N, Mastropavlos N, Yannoutsou O, Ion R, Ștefănescu D, Ceaușu A, Tufiș D et al (2019) In Skadiņa I, Gaizauskas R, Babych B, Ljubešić N, Tufiș D & Vasiļjevs A (Ed.), Using Comparable Corpora for Under-Resourced Areas of Machine Translation (pp. 55-87). Springer
  • Skadiņa I, Gaizauskas R, Vasiļjevs A & Paramita ML (2019) In Skadiņa I, Gaizauskas R, Babych B, Ljubešić N, Tufiș D & Vasiļjevs A (Ed.), Using Comparable Corpora for Under-Resourced Areas of Machine Translation (pp. 1-11). Springer
  • Babych B, Su F, Hartley A, Aker A, Paramita ML, Clough P & Gaizauskas R (2019) In Skadiņa I, Gaizauskas R, Babych B, Ljubešić N, Tufiș D & Vasiļjevs A (Ed.), Using Comparable Corpora for Under-Resourced Areas of Machine Translation (pp. 13-53). Springer
  • Aker A, Ion R, Mastropavlos N, Paramita M, Pinnis M, Ștefănescu D, Su F, Thurmair G, Irimia E, Ljubešić N, Kanoulas E et al (2019) , Theory and Applications of Natural Language Processing (pp. 291-323). Springer International Publishing
  • Aker A, Ceaușu A, Feng Y, Gaizauskas R, Hunsicker S, Ion R, Irimia E, Ștefănescu D & Tufiș D (2019) , Theory and Applications of Natural Language Processing (pp. 141-188). Springer International Publishing
  • (2013) In Sharoff S, Rapp R, Zweigenbaum P & Fung P (Ed.) Springer Berlin Heidelberg
  • Di Fabbrizio G, Stent AJ & Gaizauskas R (2013) , Mobile Speech and Advanced Natural Language Solutions (pp. 289-317). Springer New York
  • (2013) In Neustein A & Markowitz JA (Ed.) Springer New York
  • Paramita ML, Guthrie D, Kanoulas E, Gaizauskas R, Clough P & Sanderson M (2013) , Building and Using Comparable Corpora (pp. 93-112). Springer Berlin Heidelberg
  • Aker A, Plaza L, Lloret E & Gaizauskas R (2012) In Poibeau T, Saggion H, Piskorski J & Yangarber R (Ed.), Theory and Applications of Natural Language Processing (pp. 299-320). Springer
  • Clough P & Gaizauskas R (2009) , Corpus Linguistics (pp. 1249-1271). Mouton de Gruyter
  • Clough P & Gaizauskas R (2009) Corpora and text re-use, Corpus Linguistics, Part 2 (pp. 1249-1271).
  • Setzer A, Gaizauskas R & Hepple M (2005) , The Language Of Time (pp. 575-584). Oxford University Press, Oxford
  • Pustejovsky J, Ingria R, Saurí R, O JEC, Littman J, Gaizauskas R, Setzer A, Katz G & Mani I (2005) , The Language Of Time (pp. 545-558). Oxford University Press, Oxford
  • Gaizauskas R & Humphreys K (2000) , Studies in Corpus Linguistics (pp. 145-170). John Benjamins Publishing Company
  • Wilks Y & Gaizauskas R (1999) , Text, Speech and Language Technology (pp. 197-214). Springer Netherlands
  • Gaizauskas R, Saggion H & Barker E () , Text, Speech and Language Technology (pp. 85-105). Springer Netherlands
  • Gaizauskas R & Barker EJ () , The Kluwer International Series on Information Retrieval (pp. 195-238). Springer-Verlag

Conference proceedings

  • Clayton J, Damonte M & Gaizauskas R (2024) . Computational Models of Argument: Proceedings of COMMA 2024, Vol. 388 (pp 37-48). Hagen, Germany, 18 September 2024 - 18 September 2024.
  • Booth CW, Thomas A & Gaizauskas R (2024) BLN600: A parallel corpus of machine/human transcribed nineteenth century newspaper texts. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp 2440-2446). Torino, Italia, 20 May 2024 - 20 May 2024.
  • Thomas A, Gaizauskas R & Lu H (2024) Leveraging LLMs for Post-OCR Correction of Historical Newspapers. 3rd Workshop on Language Technologies for Historical and Ancient Languages, LT4HALA 2024 at LREC-COLING 2024 - Workshop Proceedings (pp 116-121)
  • Alrashid T & Gaizauskas R (2023) ScANT: A Small Corpus of Scene-Annotated Narrative Texts [resource papers]. CEUR Workshop Proceedings, Vol. 3370 (pp 143-149)
  • Booth CW, Shoemaker R & Gaizauskas R (2022) A Language Modelling Approach to Quality Assessment of OCR'ed Historical Text. 2022 Language Resources and Evaluation Conference LREC 2022 (pp 5859-5864)
  • Barker E, Barker J, Gaizauskas R, Ma N & Paramita ML (2022) SNuC: The Sheffield Numbers Spoken Language Corpus. Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp 1978-1984). Marseille, France, 20 June 2022 - 20 June 2022.
  • Aldihan H, Gaizauskas R & Fitzmaurice S (2022) . Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP) (pp 372-380), December 2022 - December 2022.
  • Alrashid T & Gaizauskas R (2021) A pilot study on annotating scenes in narrative text using sceneML. CEUR Workshop Proceedings, Vol. 2860 (pp 7-14)
  • Paramita ML, Clough P & Gaizauskas R (2017) . ECIR 2017: Advances in Information Retrieval (10193) (pp 663-669). Aberdeen, UK
  • Funk A, Aker A, Barker E, Paramita ML, Hepple M & Gaizauskas R (2017) . Advances in Information Retrieval. ECIR 2017 (10193) (pp 758-761). Aberdeen, UK, 8 April 2017 - 8 April 2017.
  • Tang Y, Wang JK, Gao B, Dellandréa E, Gaizauskas R & Chen L (2016) . 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, Nevada, 26 June 2016 - 26 June 2016.
  • Aker A, Paramita M, Kurtic E, Funk A, Barker E, Hepple M & Gaizauskas R (2016) . Proceedings of the 9th International Natural Language Generation Conference (pp 61-69). Edinburgh, UK, 5 September 2016 - 5 September 2016.
  • Gilbert A, Piras L, Wang JK, Yan F, Ramisa A, Dellandrea E, Gaizauskas R, Villegas M & Mikolajczyk K (2016) Overview of the ImageCLEF 2016 Scalable Concept Image Annotation Task. CLEF 2016 Working Notes (pp 254-278). Évora, Portugal, 5 September 2016 - 5 September 2016.
  • Wang J & Gaizauskas R (2016) Don't mention the shoe! A learning to rank approach to content selection for image description generation. Proceedings of the 9th International Natural Language Generation conference (pp 193-202). Edinburgh, Scotland, 5 September 2016 - 5 September 2016.
  • Barker E, Paramita ML, Aker A, Kurtic E, Hepple M & Gaizauskas R (2016) . Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue (pp 42-52). Los Angeles, USA, 13 September 2016 - 13 September 2016.
  • Villegas M, Müller H, García Seco de Herrera A, Schaer R, Bromuri S, Gilbert A, Piras L, Wang JK, Yan F, Ramisa A, Dellandrea E et al (2016) . Experimental IR Meets Multilinguality, Multimodality, and Interaction, Vol. 9822. Évora, Portugal
  • Barker E, Paramita M, Funk A, Kurtic E, Aker A, Foster J, Hepple M & Gaizauskas R (2016) What's the issue here?: Task-based evaluation of reader comment summarization systems. Proceedings of LREC 2016, Tenth International Conference on Language Resources and Evaluation (pp 3094-3101). Portorož, Slovenia, 23 May 2016 - 23 May 2016.
  • Riccardi G, Bechet F, Danieli M, Favre B, Gaizauskas R, Kruschwitz U & Poesio M (2016) (pp 10-33)
  • Funk A, Gaizauskas R & Favre B (2016) A document repository for social media and speech conversations. Proceedings of the 10th International Conference on Language Resources and Evaluation LREC 2016 (pp 436-440)
  • Aker A, Paramita M, Kurtic E, Funk A, Barker E, Hepple M & Gaizauskas R (2016) . Proceedings of the 9th International Natural Language Generation conference (pp 61-69), 2016 - 2016.
  • Aker A, Kurtic E, Balamurali AR, Paramita M, Barker E, Hepple M & Gaizauskas R (2016) . Advances in Information Retrieval, Vol. 9626 (pp 15-29). Padua, Italy, 20 March 2016 - 20 March 2016.
  • Barker E & Gaizauskas R (2016) . Proceedings of the Third Workshop on Argument Mining (ArgMining2016) (pp 12-20), August 2016 - August 2016.
  • Ramisa A, Wang JK, Lu Y, Dellandrea E, Moreno-Noguer F & Gaizauskas R (2015) Combining Geometric, Textual and Visual Features for Predicting Prepositions in Image Descriptions. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp 214-220). Lisbon, Portugal, 17 September 2015 - 17 September 2015.
  • Gaizauskas R, Wang J & Ramisa A (2015) . Proceedings of the Fourth Workshop on Vision and Language (pp 10-17), September 2015 - September 2015.
  • Wang JK & Gaizauskas R (2015) Generating Image Descriptions with Gold Standard Visual Inputs: Motivation, Evaluation and Baselines. Proceedings of the 15th European Workshop on Natural Language Generation (ENLG) (pp 117-126). Brighton, UK, 10 September 2015 - 10 September 2015.
  • Gilbert A, Piras L, Wang JK, Yan F, Dellandrea E, Gaizauskas R, Villegas M & Mikolajczyk K (2015) Overview of the ImageCLEF 2015 Scalable Image Annotation, Localization and Sentence Generation Task. CEUR Workshop Proceedings. Toulouse, France, 8 September 2015 - 11 September 2015.
  • Aker A, Celli F, Funk JA, Kurtic E, Hepple M & Gaizauskas R (2015) Sheffield-Trento System for Sentiment and Argument Structure Enhanced Comment-to-Article Linking in the Online News Domain. MultiLing 2015 in SIGDIAL. Prague, 2 September 2015 - 4 September 2015.
  • Derczynski L & Gaizauskas R (2015) Temporal relation classification using a model of tense and aspect. International Conference Recent Advances in Natural Language Processing RANLP, Vol. 2015-January (pp 118-122)
  • Aker A, Kurtic E, Hepple M, Gaizauskas R & Di Fabbrizio G (2015) Comment-to-Article Linking in the Online News Domain. Proceedings of the SIGDIAL 2015 Conference (pp 245-249). Prague, 2 September 2015 - 2 September 2015.
  • Wang JK, Yan F, Aker A & Gaizauskas R (2014) A Poodle or a Dog? Evaluating Automatic Image Annotation Using Human Descriptions at Different Levels of Granularity. Proceedings of the Workshop on Vision and Language 2014 (VL'14), in conjunction with the 25th International Conference on Computational Linguistics (COLING 2014). Dublin, 23 August 2014 - 23 August 2014.
  • Alhelbawy A & Gaizauskas R (2014) Collective named entity disambiguation using graph ranking and clique partitioning approaches. Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers (pp 1544-1555)
  • Aker A, Paramita ML, Pinnis M & Gaizauskas R (2014) Bilingual dictionaries for all EU languages. LREC 2014 Proceedings (pp 2839-2845). Reykjavik, Iceland, 26 May 2014 - 26 May 2014.
  • Aker A, Paramita ML, Barker E & Gaizauskas R (2014) Bootstrapping Term Extractors for Multiple Languages. Proceedings of the 9th LREC Conference (pp 483-489). Reykjavik, Iceland, 26 May 2014 - 26 May 2014.
  • Alhelbawy A & Gaizauskas R (2014) . Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), June 2014 - June 2014.
  • Di Fabbrizio G, Stent A & Gaizauskas R (2014) . Proceedings of the 8th International Natural Language Generation Conference (INLG) (pp 54-63), June 2014 - June 2014.
  • Gaizauskas R, Barker E, Paramita ML & Aker A (2014) . Proceedings of the 4th International Workshop on Computational Terminology (Computerm) (pp 11-21), August 2014 - August 2014.
  • Derczynski L & Gaizauskas R (2013) Empirical Validation of Reichenbach’s Tense Framework. International Conference on Computational Semantics. ACL
  • Derczynski L & Gaizauskas R (2013) Temporal Signals Help Label Temporal Relations. Proceedings of the 51st meeting of the Association for Computational Linguistics. ACL
  • Gaizauskas RJ, Aker A & Lestari Paramita M (2013) Extracting bilingual terminologies from comparable corpora. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. Sofia, Bulgaria
  • Paramita M, Clough P, Aker A & Gaizauskas R (2012) Correlation between Similarity Measures for Inter-Language Linked Wikipedia Articles. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012). Istanbul, Turkey, 21 May 2012 - 27 May 2012.
  • Skadina I, Aker A, Mastropavlos N, Su F, Tufis D, Verlic M, Vasiljevs A, Babych B, Clough P, Gaizauskas R, Glaros N et al (2012) Collecting and using comparable corpora for statistical machine translation. Proceedings of the 8th International Conference on Language Resources and Evaluation LREC 2012 (pp 438-445)
  • Llorens H, Derczynski L, Gaizauskas RJ & Saquete E (2012) TIMEN: An Open Temporal Expression Normalisation Resource. LREC (pp 3044-3051)
  • Barker E & Gaizauskas R (2012) Assessing the Comparability of News Texts. LREC 2012 - Eighth International Conference on Language Resources and Evaluation (pp 3996-4003)
  • Aker A, Kanoulas E & Gaizauskas R (2012) A light way to collect comparable corpora from the Web. LREC 2012 - Eighth International Conference on Language Resources and Evaluation (pp 15-20)
  • Alhelbawy A & Gaizauskas R (2012) Named Entity Based Document Similarity with SVM-Based Re-ranking for Entity Linking. Advanced Machine Learning Technologies and Applications, Vol. 322 (pp 379-388)
  • Aker A, Cohn T & Gaizauskas R (2012) Redundancy reduction for multi-document summaries using A* search and discriminative training. Proceedings of the Workshop on Automatic Text Summarization of the Future. Spain
  • Aker A, Kanoulas E & Gaizauskas R (2012) A light way to collect comparable corpora from the Web. Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), Istanbul, Turkey (pp 21-27)
  • Burman A, Jayapal A, Kannan S, Kavilikatta M, Alhelbawy A, Derczynski L & Gaizauskas R (2012) USFD at KBP 2011: Entity Linking, Slot Filling and Temporal Bounding
  • Derczynski L & Gaizauskas R (2012) A Corpus-based Study of Temporal Signals. Proceedings of the 6th Conference on Corpus Linguistics (2011), No. 197, pp. 1-8
  • Derczynski L & Gaizauskas R (2012) An Annotation Scheme for Reichenbach's Verbal Tense Structure. Proc. 6th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (2011) 10-17
  • Derczynski L & Gaizauskas R (2012) Using Signals to Improve Automatic Classification of Temporal Relations
  • Derczynski L & Gaizauskas R (2012) USFD2: Annotating Temporal Expresions and TLINKs for TempEval-2. Proc. 5th International Workshop on Semantic Evaluation (2010) 337-340
  • Derczynski L & Gaizauskas R (2012) Analysing Temporally Annotated Corpora with CAVaT. Proc. LREC (2010) 398-404
  • Derczynski L, Wang J, Gaizauskas R & Greenwood MA (2012) A Data Driven Approach to Query Expansion in Question Answering. Proc. IR4QA Workshop (2008) 34-41
  • Burman A, Jayapal A, Kannan S, Kavilikatta M, Alhelbawy A, Derczynski L & Gaizauskas RJ (2011) USFD at KBP 2011: Entity Linking, Slot Filling and Temporal Bounding. TAC
  • Llorens H, Saquete E, Navarro B & Gaizauskas RJ (2011) Time-Surfer: Time-Based Graphical Access to Document Content. ECIR, Vol. 6611 (pp 767-771)
  • Aker A & Gaizauskas R (2010) Generating image descriptions using dependency relational patterns. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL) (pp 1250-1258)
  • Aker A, Cohn T & Gaizauskas R (2010) Multi-document summarization using A* search and discriminative training. Proceedings of the 2010 Conference on Empirical Methods on Natural Language Processing (EMNLP) (pp 482-491). Cambridge, MA, USA
  • Fan X, Aker A, Tomko M, Smart P, Sanderson M & Gaizauskas RJ (2010) Automatic image captioning from the web for GPS photographs. Multimedia Information Retrieval (pp 445-448)
  • Skadina I, Vasiljevs A, Skadins R, Gaizauskas R, Tufis D & Gornostay T (2010) Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation. LREC 2010 - Seventh International Conference on Language Resources and Evaluation (pp 6-14)
  • Aker A, Cohn T & Gaizauskas R (2010) Multi-document summarization using A* search and discriminative training. Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (pp 482-491)
  • Derczynski L & Gaizauskas RJ (2010) Analysing Temporally Annotated Corpora with CAVaT. LREC
  • Aker A & Gaizauskas R (2010) Model Summaries for Location-related Images. Proc. of the 7th conference on International Language Resources and Evaluation
  • Aswani N & Gaizauskas RJ (2010) English-Hindi Transliteration using Multiple Similarity Metrics. LREC
  • Aswani N & Gaizauskas RJ (2010) Developing Morphological Analysers for South Asian Languages: Experimenting with the Hindi and Gujarati Languages. LREC
  • Catizone R, Dingli A & Gaizauskas RJ (2010) Using Dialogue Corpora to Extend Information Extraction Patterns for Natural Language Understanding of Dialogue. LREC
  • Derczynski L & Gaizauskas RJ (2010) USFD2: Annotating Temporal Expresions and TLINKs for TempEval-2. SemEval@ACL (pp 337-340)
  • Roberts A, Gaizauskas RJ, Hepple M, Demetriou G, Guo Y, Roberts I & Setzer A (2009) Building a semantically annotated corpus of clinical texts. J. Biomed. Informatics, Vol. 42 (pp 950-966)
  • Stevenson M, Guo Y, Al Amri A & Gaizauskas R (2009) . Proceedings of the Workshop on BioNLP - BioNLP '09 (pp 71-71), 4 June 2009 - 5 June 2009.
  • Stevenson M, Guo Y, Gaizauskas R & Martinez D (2008) . Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing - BioNLP '08 (pp 80-80), 19 June 2008 - 19 June 2008.
  • Shaw R, Solway B, Gaizauskas R & Greenwood MA (2008) . Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering - IRQA '08 (pp 58-65), 24 August 2008 - 24 August 2008.
  • Gaizauskas R (2008) . Proceedings of the Workshop on Multi-source Multilingual Information Extraction and Summarization - MMIES '08 (pp 1-1), 23 August 2008 - 23 August 2008.
  • Aker A & Gaizauskas R (2008) . Proceedings of the Workshop on Multi-source Multilingual Information Extraction and Summarization - MMIES '08 (pp 41-41), 23 August 2008 - 23 August 2008.
  • Stevenson M, Guo Y & Gaizauskas R (2008) Acquiring Sense Tagged Examples using Relevance Feedback. Proceedings of the 22nd International Conference on Computational Linguistics (COLING-08). Manchester, UK
  • Roberts A, Gaizauskas R & Hepple M (2008) . Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing - BioNLP '08 (pp 10-10), 19 June 2008 - 19 June 2008.
  • Demetriou G, Gaizauskas RJ, Sun H & Roberts A (2008) ANNALIST - ANNotation ALIgnment and Scoring Tool. LREC
  • Roberts A, Gaizauskas RJ, Hepple M & Guo Y (2008) Combining Terminology Resources and Statistical Methods for Entity Recognition: an Evaluation. LREC
  • Verhagen M, Gaizauskas R, Schilder F, Hepple M, Katz G & Pustejovsky J (2007) . Proceedings of the 4th International Workshop on Semantic Evaluations - SemEval '07 (pp 75-80), 23 June 2007 - 24 June 2007.
  • Hepple M, Setzer A & Gaizauskas R (2007) . Proceedings of the 4th International Workshop on Semantic Evaluations - SemEval '07 (pp 438-441), 23 June 2007 - 24 June 2007.
  • Hepple M, Setzer A & Gaizauskas R (2007) USFD: Preliminary exploration of features and classifiers for the TempEval-2007 tasks. ACL 2007 SemEval 2007 Proceedings of the 4th International Workshop on Semantic Evaluations (pp 438-441)
  • Verhagen M, Gaizauskas R, Schilder F, Hepple M, Katz G & Pustejovsky J (2007) SemEval-2007 task 15: TempEval temporal relation identification. ACL 2007 SemEval 2007 Proceedings of the 4th International Workshop on Semantic Evaluations (pp 75-80)
  • Gaizauskas RJ, Harkema H, Hepple M & Setzer A (2006) . TIME (pp 188-195)
  • Davis N, Demetriou G, Gaizauskas RJ, Guo Y & Roberts I (2006) Web Service Architectures for Text Mining: An Exploration of the Issues via an E-Science Demonstrator. Int. J. Web Serv. Res., Vol. 3 (pp 95-112)
  • Barker E, Higashinaka R, Mairesse F, Gaizauskas R, Walker M & Foster J (2006) Simulating cub reporter dialogues: The collection of naturalistic human-human dialogues for information access to text archives. Proceedings of the 5th International Conference on Language Resources and Evaluation LREC 2006 (pp 125-130)
  • Saggion H & Gaizauskas R (2006) Language resources for background gathering. Proceedings of the 5th International Conference on Language Resources and Evaluation LREC 2006 (pp 1318-1321)
  • Greenwood MA, Stevenson M & Gaizauskas RJ (2006) The University of Sheffield's TREC 2006 Q&A Experiments. TREC, Vol. 500-272
  • Saggion H & Gaizauskas RJ (2006) Experiments in Passage Selection and Answer Identification for Question Answering. FinTAL, Vol. 4139 (pp 291-302)
  • Setzer A, Gaizauskas RJ & Hepple M (2005) . Lang. Resour. Evaluation, Vol. 39 (pp 243-265)
  • Gaizauskas RJ, Greenwood MA, Harkema H, Hepple M, Saggion H & Sanka A (2005) The University of Sheffield's TREC 2005 Q&A Experiments. TREC, Vol. 500-266
  • Saggion H & Gaizauskas RJ (2005) Experiments on Statistical and Pattern-Based Biographical Summarization. EPIA, Vol. 3808 (pp 611-621)
  • Gaizauskas R, Hepple M, Saggion H, Greenwood MA & Humphreys K (2005) . Proceedings of the Ninth International Workshop on Parsing Technology - Parsing '05 (pp 200-201), 9 October 2005 - 10 October 2005.
  • Aswani N & Gaizauskas R (2005) . Proceedings of the ACL Workshop on Building and Using Parallel Texts - ParaText '05 (pp 115-115), 29 June 2005 - 30 June 2005.
  • Aswani N & Gaizauskas R (2005) . Proceedings of the ACL Workshop on Building and Using Parallel Texts - ParaText '05 (pp 57-57), 29 June 2005 - 30 June 2005.
  • Gaizauskas R, Hepple M, Saggion H, Greenwood MA & Humphreys K (2005) SUPPLE: A practical parser for natural language engineering applications. IWPT 2005 Proceedings of the 9th International Workshop on Parsing Technologies (pp 200-201)
  • Saggion H, Barker E, Gaizauskas R & Foster J (2005) Integrating NLP tools to support information access to news archives. International Conference Recent Advances in Natural Language Processing RANLP, Vol. 2005-January (pp 452-458)
  • Gaizauskas RJ, Greenwood MA, Hepple M, Roberts I & Saggion H (2004) The University of Sheffield's TREC 2004 QA Experiments. TREC, Vol. 500-261
  • Saggion H & Gaizauskas RJ (2004) Mining On-line Sources for Definition Knowledge. FLAIRS (pp 61-66)
  • Gaizauskas RJ, Hepple M & Greenwood MA (2004) . SIGIR Forum, Vol. 38 (pp 41-44)
  • Mitchell B & Gaizauskas RJ (2004) A Labelled Corpus for Prepositional Phrase Attachment. LREC
  • Harkema H, Gaizauskas RJ, Hepple M, Davis N, Guo Y, Roberts A & Roberts I (2004) A Large-Scale Resource for Storing and Recognizing Technical Terminology. LREC
  • Guo Y, Harkema H & Gaizauskas RJ (2004) Sheffield University and the TREC 2004 Genomics Track: Query Expansion Using Synonymous Terms. TREC, Vol. 500-261
  • Pustejovsky J, Saurí R, Castaño JM, Radev DR, Gaizauskas RJ, Setzer A, Sundheim B & Katz G (2004) Representing Temporal and Event Knowledge for QA Systems. New Directions in Question Answering (pp 99-112)
  • Gaizauskas RJ, Davis N, Demetriou G, Guo Y & Roberts I (2004) . IEEE SCC (pp 145-152)
  • Roberts I & Gaizauskas RJ (2004) Evaluating Passage Retrieval Approaches for Question Answering. ECIR, Vol. 2997 (pp 72-84)
  • Moreau L, Miles S, Goble CA, Greenwood RM, Dialani V, Addis M, Alpdemir MN, Cawley R, Roure DD, Ferris J, Gaizauskas RJ et al (2003) On the Use of Agents in BioInformatics Grid. CCGRID (pp 653-660)
  • Harmain HM & Gaizauskas RJ (2003) CM-Builder: A Natural Language-Based CASE Tool for Object-Oriented Analysis. Autom. Softw. Eng., Vol. 10 (pp 157-181)
  • Gaizauskas RJ (2003) Recent Advances in Computational Terminology edited by Didier Bourigault, Christian Jacquemin, and Marie-Claude L'Homme. Comput. Linguistics, Vol. 29 (pp 328-332)
  • Gaizauskas RJ, Greenwood MA, Hepple M, Roberts I, Saggion H & Sargaison M (2003) The University of Sheffield's TREC 2003 Q&A Experiments. TREC, Vol. 500-255 (pp 782-790)
  • Pustejovsky J, Castaño JM, Ingria R, Saurí R, Gaizauskas RJ, Setzer A, Katz G & Radev DR (2003) TimeML: Robust Specification of Event and Temporal Expressions in Text. New Directions in Question Answering (pp 28-34)
  • Ainsworth S, Clarke D & Gaizauskas RJ (2002) Using Edit Distance Algorithms to Compare Alternative Approaches to ITS Authoring. Intelligent Tutoring Systems, Vol. 2363 (pp 873-882)
  • Mitchell B & Gaizauskas RJ (2002) A Comparison of Machine Learning Algorithms for Prepositional Phrase Attachment. LREC
  • Baker P, Hardie A, McEnery T, Cunningham H & Gaizauskas RJ (2002) EMILLE, A 67-Million Word Corpus of Indic Languages: Data Collection, Mark-up and Harmonisation. LREC
  • Clough PD, Gaizauskas RJ & Piao SSL (2002) Building and annotating a corpus for the study of journalistic text reuse. LREC
  • Clough P, Gaizauskas R, Piao SSL & Wilks Y (2002) METER: MEasuring TExt Reuse. 40th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp 152-159)
  • Greenwood MA, Roberts I & Gaizauskas RJ (2002) The University of Sheffield TREC 2002 Q&A System. TREC, Vol. 500-251
  • Demetriou G & Gaizauskas RJ (2002) Utilizing text mining results: The Pasta Web System. ACL Workshop on Natural Language Processing in the Biomedical Domain (pp 77-84)
  • Oakes MP, Gaizauskas RJ & Fowkes H (2001) A Method Based on the Chi-Square Test for Document Classification. SIGIR (pp 440-441)
  • Scott S & Gaizauskas RJ (2001) QA-LaSIE: A Natural Language Question Answering System. AI, Vol. 2056 (pp 172-182)
  • Setzer A & Gaizauskas R (2001) . Proceedings of the workshop on Temporal and spatial information processing -, Vol. 13 (pp 1-8), 7 July 2001 - 7 July 2001.
  • Gaizauskas R, Herring P, Oakes M, Beaulieu M, Willett P, Fowkes H & Jonsson A (2001) . Proceedings of the first international conference on Human language technology research - HLT '01 (pp 1-5), 18 March 2001 - 21 March 2001.
  • Bontcheva K, Brewster C, Ciravegna F, Cunningham H, Guthrie L, Gaizauskas R & Wilks Y (2001) . Proceedings of the workshop on Human Language Technology and Knowledge Management -, Vol. 2001 (pp 1-8), 6 July 2001 - 7 July 2001.
  • Gaizauskas RJ, Rodgers PJ & Humphreys K (2001) . J. Vis. Lang. Comput., Vol. 12 (pp 375-412)
  • Setzer A & Gaizauskas RJ (2000) Annotating Events and Temporal Information in Newswire Texts.. LREC
  • Demetriou G & Gaizauskas RJ (2000) Automatically Augmenting Terminological Lexicons from Untagged Text.. LREC
  • Scott S & Gaizauskas RJ (2000) University of Sheffield TREC-9 Q&A System.. TREC, Vol. 500-249
  • Harmain HM & Gaizauskas RJ (2000) . ASE (pp 45-54)
  • Stevenson M & Gaizauskas RJ (2000) Experiments on Sentence Boundary Detection.. ANLP (pp 84-89)
  • Stevenson M & Gaizauskas RJ (2000) Using Corpus-derived Name Lists for Named Entity Recognition.. ANLP (pp 290-295)
  • Krotov A, Hepple M, Gaizauskas RJ & Wilks Y (1999) Compacting the Penn Treebank Grammar. CoRR, Vol. cs.CL/9902001
  • Azzam S, Humphreys K & Gaizauskas R (1999) . Proceedings of the Workshop on Coreference and its Applications - CorefApp '99 (pp 77-77), 22 June 1999 - 22 June 1999.
  • Azzam S, Humphreys K, Gaizauskas RJ & Wilks Y (1999) . Appl. Artif. Intell., Vol. 13 (pp 705-724)
  • Humphreys K, Gaizauskas RJ, Hepple M & Sanderson M (1999) University of Sheffield TREC-8 Q&A System.. TREC, Vol. 500-246
  • Gaizauskas RJ (1998) . Comput. Speech Lang., Vol. 12 (pp 249-262)
  • Azzam S, Humphreys K & Gaizauskas RJ (1998) Evaluating a Focus-Based Approach to Anaphora Resolution. CoRR, Vol. cmp-lg/9807001
  • Krotov A, Hepple M, Gaizauskas RJ & Wilks Y (1998) Compacting the Penn Treebank Grammar.. COLING-ACL (pp 699-703)
  • Azzam S, Humphreys K & Gaizauskas RJ (1998) Evaluating a Focus-Based Approach to Anaphora Resolution.. COLING-ACL (pp 74-78)
  • Gaizauskas RJ, Hepple M & Huyck CR (1998) A scheme for comparative evaluation of diverse parsing systems.. LREC (pp 143-152)
  • Humphreys K, Gaizauskas R, Azzam S, Huyck C, Mitchell B, Cunningham H & Wilks Y (1998) University of Sheffield: Description of the LaSIE-II system as used for MUC-7. 7th Message Understanding Conference Muc 1998 Proceedings
  • Gaizauskas RJ & Humphreys K (1997) Conception vs. Lexicons: An Architecture for Multilingual Information Extraction.. SCIE, Vol. 1299 (pp 28-43)
  • Rodgers P, Gaizauskas R, Humphreys K & Cunningham H (1997) Visual execution and data visualisation in natural language processing. 1997 IEEE SYMPOSIUM ON VISUAL LANGUAGES, PROCEEDINGS (pp 338-343)
  • Cunningham H, Humphreys K, Gaizauskas RJ & Wilks Y (1997) Software Infrastructure for Natural Language Processing. CoRR, Vol. cmp-lg/9702005
  • Humphreys K, Gaizauskas R & Azzam S (1997) . Proceedings of a Workshop on Operational Factors in Practical, Robust Anaphora Resolution for Unrestricted Texts - ANARESOLUTION '97 (pp 75-81), 11 July 1997 - 11 July 1997.
  • Rodgers PJ, Gaizauskas RJ, Humphreys K & Cunningham H (1997) . VL (pp 342-347)
  • Gaizauskas RJ & Robertson AM (1997) Coupling information retrieval and information extraction: A new text technology for gathering information from the web.. RIAO (pp 356-370)
  • Robertson AM & Gaizauskas RJ (1997) On the Marriage of Information Retrieval and Information Extraction.. BCS-IRSG Annual Colloquium on IR Research
  • Cunningham H, Humphreys K, Gaizauskas RJ & Wilks Y (1997) GATE - a General Architecture for Text Engineering.. ANLP (pp 29-30)
  • Cunningham H, Humphreys K, Gaizauskas RJ & Wilks Y (1997) Software Infrastructure for Natural Language Processing.. ANLP (pp 237-244)
  • Gaizauskas R, Cunningham H, Wilks Y, Rodgers P & Humphreys K (1996) GATE: An environment to support research and development in natural language engineering. EIGHTH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS (pp 58-66)
  • Takemoto Y, Wakao T, Yamada H, Gaizauskas R & Wilks Y (1996) . Proceedings of a workshop on held at Vienna, Virginia May 6-8, 1996 - (pp 475-475), 6 May 1996 - 8 May 1996.
  • Cunningham H, Humphreys K, Gaizauskas R & Wilks Y (1996) . Proceedings of a workshop on held at Vienna, Virginia May 6-8, 1996 - (pp 121-121), 6 May 1996 - 8 May 1996.
  • Gaizauskas R & Humphreys K (1996) XI: A simple prolog-based language for cross-classification and inheritance. ARTIFICIAL INTELLIGENCE: METHODOLOGY, SYSTEMS, APPLICATIONS, Vol. 35 (pp 86-95)
  • Wakao T, Gaizauskas RJ & Wilks Y (1996) Evaluation of an Algorithm for the Recognition and Classification of Proper Names.. COLING (pp 418-423)
  • Cunningham H, Wilks Y & Gaizauskas RJ (1996) GATE-a General Architecture for Text Engineering.. COLING (pp 1057-1060)
  • Gaizauskas RJ, Humphreys K, Cunningham H & Wilks Y (1995) University of Sheffield: description of the LaSIE system as used for MUC-6.. MUC (pp 207-220)
  • Gaizauskas RJ, Cahill LJ & Evans R (1993) Sussex University: description of the Sussex system used for MUC-5.. MUC (pp 321-335)
  • Gaizauskas RJ (1991) Deriving Answers to Logical Queries Via Answer Composition.. ALPUK (pp 112-134)
  • Evans R, Gaizauskas R & Hartley AF (1990) POETIC - The Portable Extendible Traffic Information Collator. OECD Workshop on Knowledge-Based Expert Systems in Transportation, Vol 1, Vol. 116 (pp 171-184)
  • Wang JK & Gaizauskas R (2016) Cross-validating Image Description Datasets and Evaluation Metrics. Proceedings of the 10th Language Resources and Evaluation Conference (pp 3059-3066). Portorož, Slovenia, 23 May 2016 - 23 May 2016.
  • Gaizauskas RJ (2010) Generating image descriptions using dependency relational patterns. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (pp 1250-1258). Uppsala, Sweden, 11 July 2010 - 16 July 2010.

Working papers

  • Crouch R, Gaizauskas R & Netter K (1996) Report of the Study Group on Assessment and Evaluation.

Preprints

  • Chan J, Gaizauskas R & Zhao Z (2024) RULEBREAKERS: Challenging LLMs at the Crossroads between Formal Logic and Human-like Reasoning.
  • Krotov A, Hepple M, Gaizauskas R & Wilks Y (1999) , arXiv.
  • Azzam S, Humphreys K & Gaizauskas R (1998) , arXiv.
  • Cunningham H, Humphreys K, Gaizauskas R & Wilks Y (1997) , arXiv.
  • Cunningham H, Wilks Y & Gaizauskas RJ (1996) , arXiv.
  • Cunningham H, Gaizauskas RJ & Wilks Y (1996) , arXiv.

Grants

  • , EPSRC, 04/2019 - 09/2027, £5,508,850, as Co-PI
  • A Multimodal Speech and Graphical Interface for Hands-free Data Capture and Querying in MRO: Connecting Workers to Enterprise Information Systems, EPSRC & Research England, 07/2019 - 03/2021, £85,009, as PI
  • Investigating Spoken Dialogue to Support Manufacturing Processes, EPSRC, 03/2017 - 06/2018, £63,502, as PI
  • SENSEI: Building the business case, The University of Sheffield, 12/2016 - 03/2017, £8,541, as PI
  • Healtex: , EPSRC, 05/2016 - 02/2020, £340,240, as Co-PI
  • SENSEI: , EC FP7, 11/2013 - 10/2016, £459,034, as PI
  • VisualSense: , EPSRC, 01/2013 - 06/2016, £310,677, as PI
  • , EPSRC, 06/2012 - 05/2015, £293,127, as Co-PI
  • TAAS: Terminology As A Service, EC FP7, 06/2012 - 05/2014, £268,032, as PI
  • ACCURAT: , EC FP7, 01/2010 - 06/2012, £353,265, as PI
  • , EPSRC, 02/2007 - 01/2010, £239,920, as Co-PI
  • Cronopath: Timeline and named entity extraction for hyperlink corpora, EPSRC, 07/2005 - 12/2007, £294,632, as Co-PI
  • Real-time Text Mining for the Biomedical Literature: a collaboration between DiscoveryNet & myGrid, EPSRC, 03/2005 - 02/2006, £56,588, as PI
  • CLEF-Services, MRC, 01/2005 - 06/2008, £430,221, as PI
  • VIKEF: Virtual Information and Knowledge Environment Framework, EC FP6, 04/2004 - 03/2007, £200,020, as PI
  • Electronic cub-reporter: automatically gathering and collating background information from digital text, EPSRC, 01/2003 - 06/2006, £307,973, as PI
  • MYGRID: Directly Supporting the E-Scientist, MRC, 10/2001 - 06/2005, £320,206, as PI
  • CLEF: Clinical E-Science Framework, MRC, 10/2002 - 01/2006, £280,725, as PI
  • CLARITY: , EC FP6, 02/2001 - 01/2004, £469,576, as PI
  • , EPSRC, 06/2000 - 09/2003, £35,859, as PI

Professional activities and memberships

Head of Natural Language Processing (NLP) research group