Visual Dynamics is an open-source tool that accelerates the implementation and learning of molecular dynamics simulation using Gromacs. This protocol guides you through the steps to run a simulation of a protein-ligand complex prepared in ACPYPE and gives general directions for the other simulation models.
Visual Dynamics (VD) is a web tool that aims to facilitate the use and application of Molecular Dynamics (MD) executed in Gromacs, allowing users without computational familiarity to run short simulations for validation, demonstration, and teaching purposes. Quantum methods are more accurate, but it is currently not computationally feasible to use them for the experiments that MD performs. The tool described here has been continuously improved over the last couple of years. This protocol describes what is needed to run a simulation in VD with a protein-ligand complex previously prepared in ACPYPE and gives some general directions on the other simulation models available. For the detailed simulation, the FK506-binding protein from Plasmodium vivax complexed with the inhibitor D5 (PDB ID: 4mgv) is used, and all files used are provided. Note that this protocol specifies every option needed to reproduce the results presented, but these options are not necessarily the only ones available.
According to the IUPAC definition, MD is the simulation procedure that consists of computing the motion of atoms in a molecule or of individual atoms or molecules in solids, liquids, and gases, according to Newton's laws of motion. The forces acting on atoms, necessary to simulate their motion, are commonly calculated using force fields from molecular mechanics1. It can be applied to any phenomenon for which information at a molecular, and often atomic, level is sought2.
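As a minimal sketch of what computing atomic motion according to Newton's laws means in practice, the Python snippet below integrates a single particle in one dimension with the velocity Verlet scheme. The harmonic "force field", the parameter values, and the function names are illustrative assumptions and are not taken from VD or Gromacs; MD engines apply the same idea to thousands of atoms with a full force field (the default Gromacs integrator is a closely related leap-frog scheme).

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * a = F(x) for one particle in 1D; returns (time, x, v) tuples."""
    a = force(x) / mass
    trajectory = [(0.0, x, v)]
    for step in range(1, n_steps + 1):
        x = x + v * dt + 0.5 * a * dt * dt   # advance the position
        a_new = force(x) / mass              # force evaluated at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance the velocity with the mean acceleration
        a = a_new
        trajectory.append((step * dt, x, v))
    return trajectory

# Toy force field (assumption): a harmonic spring, F = -k * x, in arbitrary units.
k = 1.0
mass = 1.0
traj = velocity_verlet(x=1.0, v=0.0, force=lambda pos: -k * pos,
                       mass=mass, dt=0.01, n_steps=1000)

def total_energy(x, v):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x

e_start = total_energy(traj[0][1], traj[0][2])
e_end = total_energy(traj[-1][1], traj[-1][2])
print(f"energy drift after {traj[-1][0]:.1f} time units: {e_end - e_start:.2e}")
```

The near-constant total energy is the usual sanity check for an integrator of this kind: the time step must be small compared with the fastest motion in the system.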
MD is one of the techniques incorporated into bioinformatics, specifically structural bioinformatics. With it, it is possible to obtain kinetic and thermodynamic characteristics of biomolecular structures, for example, macromolecular stability, identification of allosteric sites, elucidation of mechanisms of enzymatic activity, molecular recognition and properties of complexes with small molecules, association between proteins, and protein folding and hydration3. Furthermore, MD enables a wide range of studies, including molecular design (widely used in drug design) and structure determination and refinement (X-ray, NMR, and protein modeling)3. The results obtained at the end of an MD are the richest and most complete in terms of non-quantum simulation4. Classical MD is much more efficient than might be expected from a full consideration of the physics of biomolecular systems because of a number of substantial approximations. Notably, quantum dynamical effects are usually ignored3. However, implementing an MD experiment is not trivial5. It requires knowledge of computing, especially the Linux terminal, as most structural bioinformatics software is made for it. Even with that knowledge, learning Gromacs commands and parametrization involves another steep learning curve.
Since MD was first applied to biology in 19776, the technique has evolved considerably owing to increased computing power and improved code. More than two decades ago, the first MD software intended for biological problems was launched, namely Gromacs7, AMBER8, and NAMD9.
Since their first versions, these packages have remained the most used and cited. However, they still present the same implementation difficulties that plague researchers who are not computer specialists5. Some have complex installation and configuration steps, sometimes requiring extensive knowledge of the hardware they will run on to get the most out of it, as well as highly computer-centric technical documentation. An easier way to interact with them, other than the command line and its countless parameters, is needed.
An interface acts as an intermediary between the logical process to be performed and the human10. The paradigm of how software is executed has evolved as computing resources have improved. The first digital paradigm was the command line interface (CLI), followed by the graphical user interface (GUI)11. Following this evolutionary cycle, the interface produced by the World Wide Web (or simply WEB) is considered an evolution of GUIs11. These three paradigms currently coexist, and the choice among them depends on the developers. CLI applications use textual commands on the operating system console. GUI applications, also called graphical desktop applications, use a graphical interface made up of windows, buttons, and other components, and each is built for a specific operating system. The main difference from the CLI is the use of the mouse as an additional element in human-machine interaction12. WEB applications, despite being confused with GUIs, are more complex to develop but are more versatile and by far the most agile in operation. Furthermore, they depend only on an interpreter software called a browser, which makes it possible for the client application to communicate with the server through a network independently of the operating system13.
Structural bioinformatics software most commonly uses the CLI and GUI paradigms. Some examples of classic software that use the CLI are Modeller14 for similarity modeling, Autodock15 for molecular docking, and Gromacs16 for molecular dynamics. Examples of software that adopt the GUI type are SwissPDBviewer17, Pymol18, VMD19, UCSF Chimera20, Autodock tools15, PyRx21, Biovia22, Maestro23, and Moe24, among others.
With the emergence of Hypertext Markup Language version 5 (HTML5)25, Cascading Style Sheets (CSS)26, and Javascript27 technologies, among others, many structural bioinformatics applications could be brought to the WEB, thus becoming more accessible. Examples of similarity modeling WEB servers are MODWEB28, which uses Modeller14 as a back-end, and Swissmodel29. Examples of web application servers for molecular docking are Haddock30, Swissdock31, Cluspro32, Dockthor33, and others.
While structural analysis, modeling, and docking methodologies evolved from the CLI paradigm to GUI and finally to WEB, MD continues to be mostly supported by command line execution (CLI type). Some good initiatives have emerged to improve this panorama. Examples are plugins for existing software, such as the QwikMD plugin for VMD34, the GROMACS plugin for PyMOL, and the Molecular Dynamics Simulation option in UCSF Chimera20; some new and easier CLI applications, such as ASGARD35, Gmx_qk36, and CHAPERONg37; and a robust web platform, BioBB-Wfs38. Although these plugins and applications are an advance, using them is still a challenge for most researchers without computational training. Common difficulties include problems installing and configuring the MD software, which often compromise the full execution of the simulation5.
In 2022, the Visual Dynamics software for web-based computational simulation was made available by the Laboratório de Bioinformática e Química Medicinal at Fiocruz Rondônia39. Its initial version was built in Python and Flask, allowing simulations of systems with free proteins (apoenzymes) for only 2 ns. Subsequently, it was enhanced to include an automated simulation version with ligands prepared using PRODRG40.
VD was built to assist all researchers in the fields of structural biophysics, biotechnology, and related areas who have limited computational knowledge; the tool allows these researchers to test their hypotheses involving MD simulations from any operating system and without access to a high-performance computer (HPC). The purpose of this work is to present the new features of Visual Dynamics version 3.0. Additionally, it aims to introduce an updated usage protocol for the tool and highlight the limitations to be addressed in the future, along with usage statistics up to the present moment (Figure 1).
1. Accessing the software and new user registration
2. Apoenzyme simulation submission
3. Submission of simulation of enzyme complexed with ligand prepared in ACPYPE
4. Accessing the simulation results
VD provides fully autonomous simulation execution that requires neither user intervention nor user-provided computational resources. After submitting a simulation for execution, users can leave the page or turn off their machine, and the simulation will continue running. They can also access the results from any device, be it a laptop or a mobile device.
As an example of using VD in automated mode through the WEB, a test was run on a protein-ligand complex prepared in ACPYPE using the structure of the FK506-binding protein from Plasmodium vivax complexed with the inhibitor D5 (PDB ID: 4mgv)41. The preparation followed the described protocol, and the results of the analysis are shown in Figure 2A–D.
Figure 2A shows the Root Mean Square Deviation (RMSD) of the protein relative to its initial structure over the course of the 5 ns simulation (the duration fixed in the system). The protein backbone exhibited an RMSD of less than 2.5 Å throughout the simulation. Figure 2B displays the Radius of Gyration (Rg), which describes the protein's compactness during the 5 ns simulation. This graph shows Rg along the three axes, x, y, and z, as well as the overall value. Figure 2C illustrates the Root Mean Square Fluctuation (RMSF), which represents the average fluctuation distance of each amino acid in the protein structure during the 5 ns simulation. Figure 2D depicts the energy variation in kJ/mol for the system during the energy minimization process using the steepest descent method. The minimization ended with the system stabilized at a maximum force of less than 1000 kJ/mol/nm.
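The three structural measures above can be written down compactly. The following Python sketch shows textbook definitions of RMSD, Rg, and RMSF on plain coordinate lists; the function names and toy coordinates are assumptions made for illustration, and the sketch deliberately omits steps that the Gromacs analysis tools can perform, such as least-squares fitting to the reference structure.

```python
import math

def rmsd(reference, current):
    """RMSD between two equally sized coordinate sets (no fitting applied)."""
    if len(reference) != len(current):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum((a - b) ** 2
                  for p, q in zip(reference, current)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))

def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration around the center of mass."""
    total = sum(masses)
    center = [sum(m * p[d] for p, m in zip(coords, masses)) / total
              for d in range(3)]
    squared = sum(m * sum((p[d] - center[d]) ** 2 for d in range(3))
                  for p, m in zip(coords, masses))
    return math.sqrt(squared / total)

def rmsf(frames):
    """Per-atom fluctuation around the time-averaged position of each atom."""
    n_frames = len(frames)
    result = []
    for i in range(len(frames[0])):
        mean = [sum(f[i][d] for f in frames) / n_frames for d in range(3)]
        msf = sum(sum((f[i][d] - mean[d]) ** 2 for d in range(3))
                  for f in frames) / n_frames
        result.append(math.sqrt(msf))
    return result

# Toy data (made up): two atoms shifted by one unit along z.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cur = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
rmsd_value = rmsd(ref, cur)
rg_value = radius_of_gyration([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
rmsf_values = rmsf([[(0.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]])
print(rmsd_value, rg_value, rmsf_values)
```

Coordinates are unitless here; the result carries whatever length unit the input uses.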
Another use case is when users want to run the simulation on their own server with Gromacs installed. This form of usage is not covered in this protocol, as it requires a moderate level of knowledge about the Linux terminal and CLI applications. In this mode, VD acts as a generator of modifiable Gromacs commands that, when executed as generated, achieve the same result as VD. The user downloads the MDP files and the file containing the generated commands (Figure 3A), all of which are available in VD. Executing the commands in the Linux terminal produces results like those in Figure 3B.
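When the commands are run locally, Gromacs analysis tools write their plots as .xvg text files in Grace format, where lines starting with # are comments and lines starting with @ carry plot metadata. A minimal Python sketch for reading such a file is given below; the function name and the sample content are assumptions made for illustration (the numbers are invented, not VD output), and the sketch ignores multi-dataset files.

```python
def parse_xvg(text):
    """Split the text of an .xvg file into metadata lines and numeric rows."""
    meta, rows = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(("#", "@")):
            meta.append(line)          # comment or Grace directive
            continue
        rows.append([float(value) for value in line.split()])
    return meta, rows

# Invented sample content, only to exercise the parser.
sample = """# illustrative content, not real VD output
@    title "RMSD"
@    xaxis  label "Time (ns)"
0.0  0.000
0.1  0.085
0.2  0.102
"""
meta, rows = parse_xvg(sample)
times = [row[0] for row in rows]
values = [row[1] for row in rows]
print(len(meta), "metadata lines,", len(rows), "data rows")
```

The resulting columns can then be passed to any plotting or statistics tool instead of opening the file in Grace.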
Figure 1: Visual Dynamics access and usage statistics. More than 4,000 users with unique IP addresses from around 63 countries have already accessed VD.
Figure 2: Examples of output graphs relating to automatic VD analysis. (A) Root Mean Square Deviation (RMSD) over the course of 5 ns. (B) Radius of Gyration (Rg) over the course of 5 ns. (C) Root Mean Square Fluctuation (RMSF) over the course of 5 ns. The x-axis of the RMSF graph represents the amino acid residues of the enzyme; the plotting software outputs the graph with a minimum size, and the x-axis extends for larger proteins. (D) Energy variation during the energy minimization process. Energy variation ends when the process reaches an adequate value.
Figure 3: Using VD to generate command scripts. (A) List of commands generated in VD when the command list download option is chosen. (B) Example of output from executing a command. Just copy, paste, and run in the local Linux prompt.
Supplementary Figure 1: Simplified flow chart of the process that Visual Dynamics uses to set up a new simulation.
Supplementary Figure 2: Simplified flow chart of the process that Visual Dynamics uses to manage the execution of queued simulations.
Supplementary File 1: Protein pdb file.
Supplementary File 2: Ligand itp file.
Supplementary File 3: Ligand pdb file.
Automating processes is not easy, but it is also less difficult than reprogramming a system from scratch. Gromacs is currently the most popular molecular simulation software, and it is constantly updated. The Department of Biophysical Chemistry at Groningen University initially developed it, and it is now maintained by the Life Sciences Laboratory at the University of Stockholm43.
For any new user, learning simulation techniques is a lengthy journey. VD emerges as an alternative to facilitate this learning process, as well as to validate experiments before investing resources in running simulations for longer durations. Users can execute 5 ns of simulation and thus assess the usefulness of further extending the simulation.
Automating the chaining of Gromacs commands and the passing of parameters to them is possible only because of the way Linux exposes and allows control of its terminal. One of the main features of VD that enabled full automation can be seen in Figure 3B. In a manual approach, the user must interact at the beginning of execution to choose the groups to be analyzed16. In this case, group 4 (Backbone) was selected as the default. To bypass this step and automate the user interaction, the output of the command echo "4 4" is passed behind the scenes to the gmx command through a pipe (|) in the command line itself. This is useful for programmers who want to develop a tool similar to VD, as it allows user interaction to be emulated and automated.
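For programmers who drive Gromacs from a script rather than from a shell, the same trick can be sketched in Python by writing the answers to the standard input of the child process. This is an illustration under stated assumptions, not VD's actual code: the helper name run_with_answers is hypothetical, the file names in the commented gmx rms call are placeholders, and a small Python child process stands in for gmx so that the example runs without Gromacs installed.

```python
import subprocess
import sys

def run_with_answers(command, answers):
    """Run a command and feed the interactive answers through stdin,
    which is what `echo "4 4" | <command>` does in the shell."""
    stdin_text = " ".join(str(a) for a in answers) + "\n"
    return subprocess.run(command, input=stdin_text,
                          capture_output=True, text=True, check=True)

# Stand-in for an interactive tool: it reports what it read from stdin.
stand_in = [sys.executable, "-c", "import sys; print(sys.stdin.read().split())"]
result = run_with_answers(stand_in, [4, 4])
print(result.stdout.strip())

# With Gromacs installed, a call could look like this (placeholder file names):
# run_with_answers(["gmx", "rms", "-s", "md.tpr", "-f", "md.xtc", "-o", "rmsd.xvg"], [4, 4])
```

Passing the command as a list avoids shell quoting issues, and check=True makes a failing command raise an error instead of silently continuing to the next step.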
Currently, VD is limited to running simulations with a duration of 5 ns on the WEB server. In the previous version, this limit was only 2 ns39. If a longer simulation time is needed, it is recommended to download the script and MDP files for execution on the user's own machine with Gromacs installed. The options Neutralize System and Ignore Hydrogens are always enabled and cannot be disabled. Similarly, the option Use double precision is always disabled. These options are displayed for the user's information but are fixed for performance reasons. The current version does not perform simulations with ligands prepared in OPLS or CHARMM force fields. This limitation is purely technical and temporary.
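When extending a run locally, the simulated time in an MDP file is set indirectly: it is the product of the time step dt (in ps) and the number of steps nsteps. The Python sketch below does this arithmetic; the 0.002 ps time step is an assumed, commonly used value and may differ from the one in the MDP files VD generates, and the helper names are hypothetical.

```python
def nsteps_for(duration_ns, dt_ps=0.002):
    """Number of MD steps needed to cover duration_ns with a time step of dt_ps."""
    if duration_ns <= 0 or dt_ps <= 0:
        raise ValueError("duration and time step must be positive")
    return round(duration_ns * 1000.0 / dt_ps)   # 1 ns = 1000 ps

def mdp_time_lines(duration_ns, dt_ps=0.002):
    """MDP lines for the time step and step count (dt value is an assumption)."""
    return [f"dt = {dt_ps}", f"nsteps = {nsteps_for(duration_ns, dt_ps)}"]

for duration in (2, 5, 100):
    print(duration, "ns ->", nsteps_for(duration), "steps")
print("\n".join(mdp_time_lines(5)))
```

Under this assumption, the 5 ns server limit corresponds to 2,500,000 steps, and a 100 ns production run would need 50,000,000.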
Structurally, VD has two applications: a web front-end, which is the application the user sees, and a server back-end, which is the application that does the hard work of setting up and running the MD. The web front-end is a simple interface with forms that collect what users want to do through selections, toggles, and file selectors; its flow chart is omitted here, as it is described step by step in the protocol.
The server back-end is comprised of two services: an application to manage and set up MDs and another to manage and run MDs simultaneously. Supplementary Figure 1 shows the flow taken to set up an MD, and Supplementary Figure 2 shows the flow of the execution queue application.
There are other automated or web tools for running MD besides Visual Dynamics, each with its own pros and cons. ASGARD35, Gmx_qk36, and CHAPERONg37 are excellent tools that facilitate the usage of GROMACS16 but are still CLI applications that require the user to have a minimum degree of knowledge about the shell. BioBB-Wfs38 is the application most similar to VD39; it also offers many features VD does not, as it uses BioBB44, a library composed of wrappers around popular MD tools. However, its screens can be too complex for someone just starting, and the process involves many more steps than in VD. At the end of these steps, a Common Workflow Language (CWL) description, a kind of universal workflow component, is generated for execution on HPCs. The user therefore needs access to such servers, which is usually not free.
The current public instance of VD has Gromacs installed to run on the processor (CPU), but Gromacs can also be installed to use the machine's Graphics Processing Unit (GPU), which greatly improves simulation performance45. Furthermore, the command script generated in VD is GPU-transparent, meaning it can be executed on a machine with a GPU without modifying anything in the commands.
At this point in time, VD is a great tool to introduce MD to someone new to it, but it might be best used by those who are familiar with MD and Gromacs. It can be used in simulation classes, where the professor needs a quick approach without installation or parameter configuration issues to execute an exercise. Additionally, VD can be used to validate a simulation; if it runs for 5 ns, it is justifiable to invest in extending the simulation using the files generated within VD. In general, the group is working on including more automated analyses, such as h-bonds and Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) binding energy, while also continuously seeking to improve usability.
The authors have nothing to disclose.
This work has been supported by the Fundação Oswaldo Cruz (Fiocruz), the Fundação para o Desenvolvimento Científico e Tecnológico em Saúde (Fiotec), the Instituto Nacional de Ciência e Tecnologia de Epidemiologia da Amazônia Ocidental – INCT-EpiAmO, the Fundação Rondônia de Amparo ao Desenvolvimento das Ações Científicas e Tecnológicas e à Pesquisa do Estado de Rondônia (FAPERO), the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), and the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).
ACPYPE Server | Bio2Byte | Available at https://www.bio2byte.be/acpype/
GRACE software | Plasma Laboratory at the Weizmann Institute of Science | Available at https://plasma-gate.weizmann.ac.il/Grace/
GROMACS software | GROMACS Team | Installation instructions at https://manual.gromacs.org/current/install-guide/index.html
The structure of the FK506-binding protein from Plasmodium vivax complexed with the inhibitor D5 | RCSB Protein Data Bank | Available at https://www.rcsb.org/structure/4mgv. Already contains the ligand complexed to the macromolecule.