A Technical Note on Bibliometric Analysis by Biblioshiny and VOSviewer

Himel Mondal

doi:10.1055/s-0045-1810060

Indian Journal of Radiology and Imaging, Table of Contents

CC BY-NC-ND 4.0 · Indian J Radiol Imaging

Review Article

¹Department of Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, India

Keywords

bibliometrics - librarians - PubMed - data collection - Biblioshiny - VOSviewer

Introduction

In the rapidly evolving landscape of scientific research, bibliometric analysis has become an indispensable tool for understanding the structure, impact, and development of academic knowledge.[1] With the continuous expansion of global research output, traditional methods of literature review and academic assessment often fall short in providing comprehensive insights. Bibliometric analysis, through its systematic approach, enables researchers, policymakers, and institutions to evaluate the progression of research topics.[2]

Whether analyzing the evolution of a research topic, measuring institutional research output, assessing journal impact, or evaluating an author's contribution, bibliometric methods provide objective and reproducible insights.[3] Some databases provide such analyses directly on their Web sites. For example, in Web of Science (WOS), insights about citations, top authors, or institutions can be obtained in a few clicks. Scopus also provides bibliometric analytics on its Web site.[4] However, both are subscription-based databases, and researchers in many resource-limited settings may have difficulty accessing them.[5] In contrast, PubMed is a freely accessible database of biomedical literature.[6]

With this background, this article provides a brief technical guide on how to conduct a bibliometric analysis of a topic, an institution, a journal, or an author using data obtained from PubMed and analyzed in Biblioshiny and VOSviewer.[7]

Materials and Methods

Tools Required

For bibliometric analysis, researchers need two basic tools: Biblioshiny and VOSviewer. The installation of these tools is described here for a Windows PC; the steps on an Apple Macintosh computer are similar.

Biblioshiny: To run Biblioshiny, researchers need to install R and RStudio Desktop. R is a language and environment designed for statistical computing and graphics. It offers a comprehensive suite of statistical and graphical techniques and is widely used by statisticians, data analysts, and researchers.[8] RStudio is an integrated development environment designed specifically for R.[9] It provides a user-friendly interface that includes a script editor, console, workspace viewer, and tools for plotting and debugging, making it easier to write and manage R code. [Fig. 1] shows the RStudio window.

Fig. 1 Screenshot of RStudio showing the console on the lower left, where commands are entered, and the Packages tab on the lower right, from where packages can be installed.

Biblioshiny is a Web-based application that provides a user-friendly graphical interface for performing bibliometric analysis using the Bibliometrix R package.[10] Designed to simplify complex bibliometric workflows, Biblioshiny allows users to import, analyze, and visualize scientific publication data without the need for coding.[7] It supports a wide range of data formats from major databases like Scopus, WOS, and PubMed.

To access Biblioshiny, researchers need a Web browser (e.g., Firefox, Google Chrome, Edge). Hence, to run Biblioshiny, the computer must have a Web browser, R, and RStudio installed.

To use the bibliometrix package, researchers should open RStudio and run the following commands in the console:

Code to install (needed only once): install.packages("bibliometrix")
Code to load the package: library(bibliometrix)
Code to launch biblioshiny: biblioshiny()

This will open the Biblioshiny Web interface in the default Web browser. The software is now ready for analysis. Researchers need to keep RStudio running in the background while analyzing data in the Web browser.

VOSviewer: VOSviewer is a free software tool designed for constructing and visualizing bibliometric networks.[11] These networks can be based on coauthorship, keyword cooccurrence, and other relations.[12] One of the key strengths of VOSviewer lies in its ability to handle large data sets and present complex relationships in an intuitive and visually appealing manner. The software provides various layout and clustering techniques to identify patterns and groupings within the data, making it especially useful for mapping the intellectual structure of research fields. This tool has also been explored for text analysis.[13] Anyone can download VOSviewer and use it without installing it on the system. However, Java version 8 or higher must be installed on the computer to run the software. Java is a high-level, object-oriented programming language known for its platform independence and robustness. It is widely used for building Web applications, mobile apps, enterprise software, and bioinformatics tools.[14] Java can be downloaded free of charge from its Web site. As VOSviewer is a standalone Java application, simply clicking on the VOSviewer application file (e.g., VOSviewer.exe on a Windows computer) will open it.

Data Collection from PubMed

To collect data from PubMed for a bibliometric analysis, researchers typically begin by formulating a well-defined search strategy using relevant keywords. Researchers can plan the keywords according to commonly used words and phrases and Medical Subject Headings (MeSH) terms.[15] Boolean operators (e.g., AND, OR) are used for combining search terms in PubMed.[16] Researchers should remember that this is a crucial step: any error or weakness in the search strategy will lead to an erroneous analysis. Hence, the search strategy should be finalized only after repeated piloting and consensus among the team members.

On the search results page, researchers need to click on the “Save” button to expand the available options ([Fig. 2]) and then select “All results” to save all the results, or choose to save only the results displayed on the current page or only selected items. There are five format options for saving the data: Summary (text), PubMed, PubMed Identifier (PMID),[17] Abstract (text), and comma separated value (CSV). For using the data in Biblioshiny and VOSviewer, the PubMed format is required. However, for screening studies, the CSV format is more suitable. Therefore, researchers should save the results in both formats and store the files for further use. On clicking the “Create file” button, the file will be saved on the computer.

Fig. 2 A screenshot showing a PubMed search result page where the user is saving “all results” in “PubMed format” by clicking the “create file” button.

For screening, researchers should open the CSV file in spreadsheet software such as Microsoft Excel and the PubMed format file (a plain text file that can be opened with any text editor) in Microsoft Notepad. They can review the titles of the saved results in the spreadsheet and mark studies that should be excluded from the analysis. To remove a specific study, researchers can copy its PMID and search for it (Ctrl + F) in the PubMed format file. The data for each study typically start with the PMID and end with the SO (which stands for source or reference). To delete a particular study, researchers should select and remove the content from the PMID to the SO of that study.
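When many studies have to be excluded, the manual removal described above can also be scripted. The following is a minimal Python sketch and not part of the workflow described in this article. It assumes that every record in the PubMed format file begins with a line starting with "PMID- " and that records are separated by blank lines; the function names are hypothetical.

```python
def split_records(text):
    """Split PubMed-format text into records (lists of lines).

    Assumption: each record starts with a line beginning with "PMID- ".
    Blank separator lines and anything before the first PMID are dropped.
    """
    records, current = [], []
    for line in text.splitlines():
        if line.startswith("PMID- "):
            if current:
                records.append(current)
            current = [line]
        elif current and line.strip():
            current.append(line)
    if current:
        records.append(current)
    return records


def record_pmid(record):
    """Return the PMID stored on the first line of a record."""
    return record[0][len("PMID- "):].strip()


def drop_records(text, excluded_pmids):
    """Return PubMed-format text without the records whose PMID is excluded."""
    excluded = {str(p).strip() for p in excluded_pmids}
    kept = [r for r in split_records(text) if record_pmid(r) not in excluded]
    # Rejoin records with one blank line between them, as in the saved file.
    return "\n\n".join("\n".join(r) for r in kept) + "\n"


if __name__ == "__main__":
    sample = (
        "PMID- 111\nTI  - First study\nSO  - J A. 2020;1:1-2.\n\n"
        "PMID- 222\nTI  - Second study\nSO  - J B. 2021;2:3-4.\n"
    )
    print(drop_records(sample, ["222"]))
```

In practice, the text would be read from the saved PubMed format file and the cleaned text written to a new file, so that the original download is preserved.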

Data Analysis

Biblioshiny: To perform an analysis in Biblioshiny, researchers need to open RStudio and enter the commands library(bibliometrix) followed by biblioshiny(). This will launch the Biblioshiny Web interface in the default Web browser. Researchers should click on the “Data” tab and then select “Import or Load.” Then, on the right side of the window, they should choose “Import raw file(s),” set the database as “PubMed,” and keep the author name format as “surname and initials.” Next, they need to click the “Browse” button to select the file for analysis ([Fig. 3]). After selecting the downloaded PubMed format file and allowing it to upload, clicking on the “Start” button will load the file into Biblioshiny. A window will display the data quality, and researchers need to click “Save.” After a few seconds, a tick mark will confirm successful data loading. The window can then be closed, and the data are ready for analysis.

Fig. 3 A screenshot of Biblioshiny in which the data import button (Import or Load) is shown on the left side and the import options (file types, database, author name format) are shown on the right side of the image.

Researchers can apply filters, such as by year or language, using the options on the left side of the screen (see [Fig. 4]). To run an analysis, they need to click on the relevant analysis button in the left panel. In the top right corner of the results section ([Fig. 4]), there are three buttons: play (to run the analysis), plus (to add the result to the export report), and download (to save the image). If an analysis does not appear immediately, researchers should click on the play button and wait a few moments. To export the results to an Excel file, researchers need to click on the plus (+) button; to save images, they should use the download button (↓). After completing the analysis and selecting the desired reports to export, clicking on the “Report” button will navigate to the report page. From there, researchers can click on the “Export Report” button, and the reports will be saved as a spreadsheet file (Excel format).

Fig. 4 A screenshot of Biblioshiny showing the most relevant sources (i.e., journals) under the Sources analysis.

Researchers may encounter an issue with Biblioshiny in which, after the data are loaded, the browser page becomes unresponsive and further analysis cannot be performed. In such cases, the problem can usually be resolved by installing the chromote package, which provides a headless Chrome Web browser interface. To do this, researchers need to type the following command in the RStudio console: install.packages("chromote"). This will install the required dependency, and the issue should be resolved.

VOSviewer: As VOSviewer does not require installation, researchers can run the software by clicking on the “VOSviewer.exe” file. On the landing page, researchers should click on “Create” from the File tab, select “Create a map based on bibliographic data,” and click “Next.” As the file is sourced from a database, researchers need to choose “Read data from bibliographic database files” and proceed by clicking “Next.” Then, they need to click on the PubMed tab, select the file saved on the computer, and click “Next” to choose the type of analysis and counting method.

At this stage ([Fig. 5]), researchers need to select either “Co-authorship” (to analyze author collaboration networks) or “Co-occurrence” (to analyze keyword networks). For example, to visualize a coauthorship network, they need to select “Co-authorship,” choose the counting method as “Full counting,” and optionally check the box to “Ignore documents with a large number of authors,” specifying the desired threshold.

Fig. 5 A screenshot of VOSviewer showing the “Create” button on the left side and the pop-up window for choosing the type of analysis, unit of analysis, and counting method.

Clicking on “Next” will open a screen where researchers can define inclusion criteria based on the minimum number of documents per author. Another “Next” click will display a list of authors, along with their number of documents and link strength. Finally, clicking the “Finish” button will generate the network visualization ([Fig. 6]). Researchers can switch between visualization types such as “Network Visualization,” “Overlay Visualization,” and “Density Visualization” by using the tabs provided. To save the network for future use, researchers should click on the “Screenshot” button. A pop-up window will appear that allows saving the network in various formats such as JPG, PDF, and TIFF.

Fig. 6 A screenshot of VOSviewer showing the coauthorship network, visualization tab (on the upper portion), and options to modify the visualizations (scale, size variation, background, etc.) on the right side.
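The quantities that appear in the coauthorship steps above (documents per author, link strength, the minimum-documents criterion, and the threshold for documents with many authors) can be illustrated with a small calculation. The following Python sketch shows the general idea of full counting, in which every coauthored document adds 1 to the link between each pair of its authors. It is an illustration under that assumption and not VOSviewer's own implementation; the function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coauthorship_links(documents, max_authors=None, min_documents=1):
    """Compute coauthorship statistics under full counting.

    documents     : list of author-name lists, one list per document
    max_authors   : if set, documents with more authors are ignored
    min_documents : authors with fewer documents are left out of the network
    """
    if max_authors is not None:
        documents = [d for d in documents if len(set(d)) <= max_authors]

    # Number of documents per author.
    doc_counts = Counter(a for d in documents for a in set(d))
    eligible = {a for a, n in doc_counts.items() if n >= min_documents}

    # Full counting: each coauthored document adds 1 to the pair's link.
    links = Counter()
    for d in documents:
        authors = sorted(set(d) & eligible)
        for pair in combinations(authors, 2):
            links[pair] += 1

    # Total link strength of an author is the sum over all of its links.
    total_strength = Counter()
    for (a, b), weight in links.items():
        total_strength[a] += weight
        total_strength[b] += weight
    return doc_counts, links, total_strength


if __name__ == "__main__":
    docs = [["A", "B", "C"], ["A", "B"], ["A", "D", "E", "F"]]
    counts, links, strength = coauthorship_links(docs, max_authors=3)
    print(dict(counts), dict(links), dict(strength))
```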

Data of Author, Institution, Journal, and Topic

The steps to conduct bibliometric analysis using Biblioshiny and VOSviewer are summarized in [Table 1]. Neither of the two software tools provides dedicated options to analyze data by topic, institution, journal, or author. Instead, the analysis depends entirely on the input data supplied by the researchers. Therefore, researchers must collect and curate the specific data they wish to analyze. The following section briefly describes the data collection methods for analyzing research by topic, institution, journal, or author.

Table 1
Steps of bibliometric analysis at a glance
Step	Action	Brief
1	Define objective	Choose analysis focus (topic, author, journal, or institution)
2	Formulate search strategy	Use PubMed to search published articles. Researchers can also search other databases, if they have access
3	Save data from PubMed	Save the search result in both PubMed format and CSV
4	Clean and curate data	Use Excel for screening and Notepad to remove entries using PMIDs
5	Set up Biblioshiny	Install R, RStudio, and the bibliometrix package
6	Set up VOSviewer	Ensure Java is installed; download VOSviewer. No installation is needed
7	Analyze with Biblioshiny	Launch with biblioshiny() in RStudio, which opens the interface in a Web browser; import the PubMed format file and run the analysis
8	Visualize with VOSviewer	Click on “Create” from the File tab, select “Create a map based on bibliographic data,” and follow the “Next” steps to the analysis, customizing thresholds as needed
9	Save visual outputs	Save visual output (JPG, PDF, etc.) from Biblioshiny and VOSviewer for further use
10	Interpret and report	Analyze patterns and trends; include graphs and metrics in reports

Abbreviations: CSV, comma separated value; PMID, PubMed Identifier.

For example, if researchers want to search for a particular author, such as “Sudip Bhattacharya,” they can simply enter the name in the PubMed search bar without using any field tags. However, a more targeted search would look like: Sudip Bhattacharya[Author]. Here, [Author] in square brackets is the PubMed field tag for author searches.[18]

To search for publications from a particular institution, researchers should first identify the different formats in which the institution's name is commonly cited. For example, to retrieve articles from All India Institute of Medical Sciences, Deoghar, the following query can be used: ((All India Institute of Medical Sciences[Affiliation]) OR (AIIMS[Affiliation])) AND (Deoghar[Affiliation]). Here, the full form and the abbreviation of the institution are combined with the Boolean operator OR, and the city name is added with AND.

Similarly, for bibliometric analysis of a journal, researchers should use both the full title and the abbreviated form of the journal name. For example, to analyze the Indian Journal of Radiology and Imaging, they need to use: "Indian Journal of Radiology and Imaging" OR "Indian J Radiol Imaging"[Journal]. To date, this journal has a total of 1,647 articles in PubMed, from 2008 (48 articles) to 2025 (55 articles, as of April 23, 2025). The data can be saved as shown in [Fig. 2].
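Year-wise counts such as those quoted above can be cross-checked directly from the saved PubMed format file. The following Python sketch is one way to do so. It assumes that each record contains a date-of-publication line of the form "DP  - 2008 Jan" that starts with a four-digit year; the function name is hypothetical.

```python
from collections import Counter


def publications_per_year(text):
    """Count records per publication year in PubMed-format text.

    Assumption: each record has one line starting with "DP  - "
    followed by a four-digit year (e.g., "DP  - 2008 Jan").
    """
    counts = Counter()
    for line in text.splitlines():
        if line.startswith("DP  - "):
            year = line[6:10]
            if year.isdigit():
                counts[year] += 1
    # Return the counts ordered by year for easier reading.
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = "PMID- 1\nDP  - 2008 Jan\n\nPMID- 2\nDP  - 2025 Apr\n"
    print(publications_per_year(sample))
```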

For topic-based searches, if keywords are already framed, they can be directly used in PubMed. For instance, to find papers on actigraphy-based research in diabetes, restricted to title and abstract ([tiab] is the tag for title and abstract), the following can be used: "actigraphy"[tiab] AND "diabetes"[tiab]. If the keywords and tags are not yet formulated, researchers can use the “Advanced” option to build a search string step-by-step ([Fig. 2]).[19]
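The search strings in this section follow a regular pattern: a term, an optional field tag in square brackets, and Boolean operators between grouped terms. The following Python sketch assembles the author, institution, and topic queries shown above from their parts. The helper names are hypothetical, and the sketch only builds the text of a query; it does not contact PubMed.

```python
def tagged(term, tag, quote=False):
    """Attach a PubMed field tag to a term, optionally quoting the term."""
    text = f'"{term}"' if quote else term
    return f"{text}[{tag}]"


def group(expression):
    """Wrap an expression in parentheses."""
    return f"({expression})"


def any_of(*expressions):
    """Combine expressions with OR (any of the alternatives may match)."""
    return " OR ".join(expressions)


def all_of(*expressions):
    """Combine expressions with AND (every expression must match)."""
    return " AND ".join(expressions)


author_query = tagged("Sudip Bhattacharya", "Author")

institution_query = all_of(
    group(any_of(
        group(tagged("All India Institute of Medical Sciences", "Affiliation")),
        group(tagged("AIIMS", "Affiliation")),
    )),
    group(tagged("Deoghar", "Affiliation")),
)

topic_query = all_of(
    tagged("actigraphy", "tiab", quote=True),
    tagged("diabetes", "tiab", quote=True),
)

if __name__ == "__main__":
    for query in (author_query, institution_query, topic_query):
        print(query)
```

Building queries from parts in this way makes it easier to add a further name variant or keyword without misplacing a parenthesis.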

Once the data is collected, the analysis process follows the same steps as described earlier. However, certain limitations may apply depending on the nature of the data set. For example, if data from a single journal is analyzed in Biblioshiny, the source (i.e., journal) field will contain only one entry. Similarly, if data for a single author is analyzed, that author will dominate metrics such as “most relevant author.”

Discussion

Bibliometric data on an author, institution, journal, or topic are essential for a comprehensive understanding of the research trends, productivity, and impact of that author, institution, journal, or research field.[20] An author-level analysis highlights key contributors, their collaborations, research fields, and productivity over time. Commonly, an author's impact is summarized by indices like the h-index, i10-index, or g-index.[21] However, more holistic data, such as collaborations with other authors and countries, research focus, and work on trending topics, provide a more comprehensive picture of an author. These data may help in assessing the research output of a researcher. Analyzing data by institution reveals the leading research trends of the institution, the authors who contribute the most publications, the journals where its articles are published, and other metrics. This may help in preparing the yearly research output report of an institution.[22] Journal-level analysis is often not required of editors, as many publishers provide such reports as part of their packages. However, many journals that use open-source publishing platforms or operate with limited manpower may not be able to generate bibliometrics frequently enough to assess their growth. For them, the methods described in this article would help, and the analysis offers more insights than citation counts alone.[23] Topic-based searches help identify research hotspots, emerging themes, and gaps in knowledge. Many authors now conduct bibliometric analyses and publish them to continue the scientific discourse on a particular topic.[24]
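For readers unfamiliar with the author-level indices mentioned above, the following Python sketch computes them from a list of per-article citation counts using their standard definitions. The citation counts must come from a source that provides them, and the function names are hypothetical.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations):
    """Number of articles with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


def g_index(citations):
    """Largest g such that the top g articles together have at least g*g citations.

    In this sketch, g is capped at the number of articles.
    """
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


if __name__ == "__main__":
    example = [10, 5, 3, 1, 0]
    print(h_index(example), i10_index(example), g_index(example))
```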

Beyond PubMed, several other bibliographic databases are widely used in bibliometric research, offering broader or more specialized coverage depending on the research question. Scopus (by Elsevier) and WOS (by Clarivate Analytics) are the most prominent among them, known for their comprehensive indexing across disciplines, including medicine, social sciences, and engineering.[25] These databases provide advanced filtering options, citation tracking, and citation metrics such as the h-index, which are crucial for in-depth bibliometric assessments. Dimensions.ai, a newer entrant, is gaining popularity due to its integration of grants, clinical trials, patents, and policy documents alongside traditional publications, offering a more holistic research landscape.[26]

In addition, a variety of specialized tools support bibliometric analysis and visualization. CiteSpace is a platform for identifying research trends and knowledge evolution over time.[27] Meanwhile, commercial platforms like InCites (by Clarivate)[28] and SciVal (by Elsevier)[29] offer institution-level metrics and benchmarking tools but require subscriptions. Each tool and database has its own strengths, and selecting among them depends on the research objective, available resources, and desired depth of analysis.

The novelty of this article lies in its hands-on, beginner-friendly approach to bibliometric analysis using entirely free and open-access resources. Unlike most existing tutorials that assume prior expertise or rely on subscription-based databases, this guide was designed specifically to support researchers in resource-constrained environments. However, a key limitation is the exclusive reliance on a single database (PubMed),[30] which may lead to incomplete coverage of interdisciplinary topics or articles indexed in other major bibliographic sources like Scopus or WOS. Additionally, some advanced functionalities available in commercial databases, like citation counts and top articles, are not addressed here, making this guide more suitable for basic to intermediate bibliometric analysis.

Web Sites

R: https://www.r-project.org
RStudio Desktop: https://www.posit.co
Google Chrome: https://www.google.com/chrome
Java: https://www.java.com/en
VOSviewer: https://www.vosviewer.com
PubMed: https://pubmed.ncbi.nlm.nih.gov

Conclusion

In this article, a practical overview was provided for conducting bibliometric analysis using free PubMed data. The use of two accessible tools—Biblioshiny and VOSviewer—was demonstrated to analyze data by topic, institution, journal, and author. Steps for data extraction, cleaning, and visualization were described to support reproducibility. The process was simplified and made accessible to researchers in resource-limited settings. By using open-source tools and free databases, insights into research trends and collaborations can be obtained without reliance on costly platforms.

References
1 Ellegaard O, Wallin JA. The bibliometric analysis of scholarly production: how great is the impact? Scientometrics 2015; 105 (03) 1809-1831
2 Passas I. Bibliometric analysis: the main steps. Encyclopedia 2024; 4 (02) 1014-1025
3 Donthu N, Kumar S, Mukherjee D, Pandey N, Lim WM. How to conduct a bibliometric analysis: an overview and guidelines. J Bus Res 2021; 133: 285-296
4 AlRyalat SAS, Malkawi LW, Momani SM. Comparing bibliometric analysis using PubMed, Scopus, and Web of Science databases. J Vis Exp 2019; (152)
5 Pranckutė R. Web of Science (WoS) and Scopus: the titans of bibliographic information in today's academic world. Publications 2021; 9 (01) 12
6 Jin Q, Leaman R, Lu Z. PubMed and beyond: biomedical literature search in the age of artificial intelligence. EBioMedicine 2024; 100: 104988
7 Arruda H, Silva ER, Lessa M, Proença Jr D, Bartholo R. VOSviewer and Bibliometrix. J Med Libr Assoc 2022; 110 (03) 392-395
8 Hackenberger BK. R software: unfriendly but probably the best. Croat Med J 2020; 61 (01) 66-68
9 Shimizu I, Ferreira JC. Losing your fear of using R for statistical analysis. J Bras Pneumol 2023; 49 (03) e20230212
10 Aria M, Cuccurullo C. bibliometrix: An R-tool for comprehensive science mapping analysis. J Informetrics 2017; 11 (04) 959-975
11 VOSviewer. Download. VOSviewer. Accessed April 21, 2025 at: https://www.vosviewer.com/download/
12 van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics 2010; 84 (02) 523-538
13 Bukar UA, Sayeed MS, Razak SFA, Yogarayan S, Amodu OA, Mahmood RAR. A method for analyzing text using VOSviewer. MethodsX 2023; 11: 102339
14 Fourment M, Gillings MR. A comparison of common programming languages used in bioinformatics. BMC Bioinformatics 2008; 9: 82
15 Mondal H, Mondal S, Mondal S. How to choose title and keywords for manuscript according to Medical Subject Headings. Indian J Vasc Endovasc Surg 2018; 5 (03) 141
16 Jha R, Sondhi V, Vasudevan B. Literature search: simple rules for confronting the unknown. Med J Armed Forces India 2022; 78 (Suppl. 01) S14-S23
17 Islamaj Dogan R, Murray GC, Névéol A, Lu Z. Understanding PubMed user search behavior through log analysis. Database (Oxford) 2009; 2009: bap018
18 Help. PubMed. Accessed June 12, 2025 at: https://pubmed.ncbi.nlm.nih.gov/help/
19 Sood A, Erwin PJ, Ebbert JO. Using advanced search tools on PubMed for citation retrieval. Mayo Clin Proc 2004; 79 (10) 1295-1299, quiz 1300
20 Kumar R. Bibliometric analysis: comprehensive insights into tools, techniques, applications, and solutions for research excellence. Spectrum Engineer Management Sci 2025; 3 (01) 45-62
21 Mondal H, Deepak KK, Gupta M, Kumar R. The h-Index: understanding its predictors, significance, and criticism. J Family Med Prim Care 2023; 12 (11) 2531-2537
22 Cabezas-Clavijo A, Torres-Salinas D. Bibliometric reports for institutions: best practices in a responsible metrics scenario. Front Res Metr Anal 2021; 6: 696470
23 Singh HP. Alternative research bibliometrics: it's about quality and not quantity. Shoulder Elbow 2022; 14 (02) 121-122
24 Lim WM, Kumar S. Guidelines for interpreting the results of bibliometric analysis: a sensemaking approach. Glob Bus Organ Excell 2024; 43 (02) 17-26
25 Cascajares M, Alcayde A, Salmerón-Manzano E, Manzano-Agugliaro F. The bibliometric literature on Scopus and WoS: the medicine and environmental sciences categories as case of study. Int J Environ Res Public Health 2021; 18 (11) 5851
26 Hook DW, Porter SJ, Draux H, Herzog CT. Real-time bibliometrics: dimensions as a resource for analyzing aspects of COVID-19. Front Res Metr Anal 2021; 5: 595299
27 Chen C. CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. J Am Soc Inf Sci Technol 2006; 57 (03) 359-377
28 InCites. Research. Accessed April 23, 2025 at: https://research.fiu.edu/cyberinfrastructure/applications/incites/
29 Dresbeck R. SciVal. J Med Libr Assoc 2015; 103 (03) 164-166
30 Misra DP, Ravindran V. An overview of the functionalities of PubMed. J R Coll Physicians Edinb 2022; 52 (01) 8-9
