
Disambiguation Solution for Persons’ Accounts in Research Information Management Systems


  • Integrirovannye Sistemy, LLC, 19b1, Str. Presnensky val, Moscow, 123557, Russian Federation


Objectives: Personal identification in information systems is a trivial software engineering task when a passport number or email address is available, but it becomes a challenge when users’ data cannot be stored or when an email address has been changed or shared. In particular, it is a challenge in research management systems that aggregate recent academic articles, patents, data sets, and other findings and link them to their authors. Methods/Analysis: To achieve the project objectives, the authors reviewed existing solutions, classified author disambiguation methods, and compared them. In addition, they calculated the sensitivity of the person identification process to the estimated amount of manual disambiguation work. Findings: This paper proposes an approach and algorithms that together form a solution for the personal identification process and ensure duplicate elimination. Novelty of the Study: The approach is suitable for systems with shared emails or user accounts.


Keywords: Author Disambiguation, Author Identifier, Current Research Information Systems (CRIS), Duplicate Detection, Identity Management, Master Data Management (MDM), Person Disambiguation, Research Information Management, User Accounting.






This work is licensed under a Creative Commons Attribution 3.0 License.